I. INTRODUCTION


The word “big” is, perhaps, one of the first to come to mind when 
considering coronaviruses. The nature of the coronavirus genome — 
nonsegmented, single-stranded, positive-sense RNA—is not remark- 
able, but its size, 27 to 32 kb, surely is when compared with other RNA 
viruses. The coronavirus polymerase gene alone (20-22 kb) is about
the same size as the whole of the picornavirus (~8 kb) and vesicular 
stomatitis virus (~11 kb) genomes added together. The gene encoding 
the large surface glycoprotein is up to 4.4 kb, encoding an imposing 
trimeric, highly glycosylated protein. This soars some 20 nm above 
the virion envelope, giving the virus the appearance—with a little 
imagination — of a crown or coronet (Latin corona, hence the name of 
the genus). 

Coronaviruses are responsible for a number of economically impor- 
tant diseases. Avian infectious bronchitis virus (IBV) was the first coro- 
navirus to be isolated, from the domestic fowl, and propagated in the 
1930s. In addition to respiratory disease, which can predispose chickens 
to possibly lethal secondary bacterial infections, some strains also cause 
nephritis (King and Cavanagh, 1991; Cook and Mockett, 1995). Porcine 
transmissible gastroenteritis virus (TGEV) causes devastating disease 
in newborn pigs, with mortality often approaching 100% (Enjuanes and 
van der Zeijst, 1995). Intriguingly, there are also naturally occurring 
mutants [i.e., porcine respiratory coronavirus (PRCV)] of TGEV which 
cause only mild respiratory disease and no enteritis. Several other 
coronaviruses also cause enteritis: bovine coronavirus (BCV), turkey 
coronavirus (TCV; bluecomb virus), feline coronavirus (FCV), canine 
coronavirus (CCV) and porcine epidemic diarrhea virus (PEDV). FCV
may also cause feline infectious peritonitis. An FCV has been isolated 
from a cheetah and BCVs from wild sambar deer and waterbuck 
(Tsunemitsu et al., 1995). These BCVs caused enteritis when inoculated 
into domestic calves. Humans are known to suffer from two very different
coronaviruses, human coronavirus (HCV) OC43 and HCV 229E,
both of which are a cause of the common cold. There is evidence for 
the presence of coronaviruses in tissues taken from multiple sclerosis 
(MS) patients (reviewed by Cavanagh and Macnaughton, 1995). This 
inflammatory, demyelinating neurological disease is associated with 
autoreactive T lymphocytes sensitized to myelin components of the 
central nervous system. Recently, Talbot and colleagues (1996) have 
demonstrated that many CD4+ T-cell lines derived from MS patients
showed a human leukocyte antigen (HLA)-DR-restricted, cross-reactive
pattern of antigen activation after in vitro selection with either
myelin basic protein or HCV-229E proteins, suggesting that molecular 
mimicry between HCV and myelin may be an immunopathological 
mechanism in MS. Other coronaviruses [some strains of murine hepati- 
tis virus (MHV) and porcine hemagglutinating encephalomyelitis virus 
(HEV)] are well-known causes of neurological diseases, and MHV has 
been studied for many years in this context (Dales and Anderson, 1995), 
although many MHV strains cause primarily hepatitis. 

The 1970s and early 1980s were the period in which coronavirus
virion proteins and nested-set arrangements of mRNAs were identified 
and the discontinuous nature of coronavirus transcription was initially 
demonstrated. The first published sequence of a coronavirus gene ap- 
peared in 1983, starting an era in which the whole of the genomes of four 
coronaviruses were cloned — in pieces — and sequenced. This decade has 
seen the manipulation of these clones, and of complementary DNAs 
(cDNAs) of defective-interfering (DI) RNAs, to study coronavirus RNA 
replication, transcription, recombination, processing and transport of 
proteins, virion assembly, identification of cell receptors for coronavi- 
ruses, and processing of the polymerase. 

This review is largely concerned with these areas. Some topics are 
notable by their absence, space not permitting their inclusion. For 
example, the elucidation of the molecular basis of the antigenic proper- 
ties of the large surface (spike) glycoprotein and its role in tissue tro- 
pism has been omitted. For these topics and all others both within and 
without the compass of our review for which a concurrently comprehen- 
sive and in-depth treatise is desired, the reader is referred to the book 
edited by Siddell (1995a). Individual chapters in that book will be 
referenced at the appropriate places in this review. 


II. TAXONOMY AND THE ESSENTIAL CHARACTERISTICS OF Coronaviridae 


All coronaviruses belong to one genus, Coronavirus, within the fam- 
ily Coronaviridae (Cavanagh et al., 1994, 1995). Initially, serological 
analysis was used to differentiate coronavirus species and showed that 
they could be divided into four antigenic groups (Holmes, 1990). The 
species and group divisions were subsequently refined by monoclonal 
antibody analysis and nucleotide sequencing, which revealed the close 
relatedness between TCV and BCV, resulting in the current classifica- 
tion of three antigenic groups (Table I). The same groupings emerge 
regardless of which structural protein sequences are compared (Siddell, 
1995b). Within group 1, TGEV, FCV, and CCV are particularly closely 
related, all the members of group 2 being tightly clustered. The sole 
member of group 3, IBV, not only differs extensively from all other 
coronaviruses but also exhibits extensive variation within the species. 

The Coronaviridae had remained a monogeneric family for a quarter 
of a century, until an accumulation of observations showed that
many of the features thought to be characteristic of the Coronaviridae 
applied equally well to the genus Torovirus, which had not been offi- 
cially assigned to a family (Figs. 1 and 2, Table II). Therefore, in 

TABLE I

SPECIES WITHIN THE GENERA CORONAVIRUS AND TOROVIRUS

Species                                                          Antigenic group   Contain HE gene

Coronavirus
  IBV          Avian infectious bronchitis virus                 3                 -
  MHV          Murine hepatitis virus                            2                 +
  BCV          Bovine coronavirus                                2                 +
  HCV (OC43)   Human coronavirus OC43                            2                 +
  HEV          Porcine hemagglutinating encephalomyelitis virus  2                 +
  TCV          Turkey coronavirus                                2                 +
  TGEV         Porcine transmissible gastroenteritis virus       1                 -
  FCV          Feline coronavirus and feline infectious          1                 -
               peritonitis virus (FIPV)
  CCV          Canine coronavirus                                1                 -
  HCV (229E)   Human coronavirus 229E                            1                 -
  PEDV         Porcine epidemic diarrhea virus                   1                 -
Torovirus
  BEV          Berne virus (equine)                                                +
  BRV          Breda virus (bovine)                                                NK (a)

(a) NK, not known.


1993, the International Committee on Taxonomy of Viruses (ICTV)
formally expanded the Coronaviridae to include Torovirus (Cavanagh 
et al., 1994, 1995). 

The bringing together of Coronavirus and Torovirus was not the end 
of the taxonomic story; another family, Arteriviridae, shared important 
characteristics in relation to the genome, structure, and strategies of 
transcription and translation (Table II) (Plagemann and Moennig, 
1992; Snijder and Spaan, 1995). However, the distinct morphology of 
the arteriviruses (Fig. 1), and their underlying differences from the 
coronaviruses in the size of the genome (Fig. 2) and structural proteins 
(Table II), precluded their inclusion in the Coronaviridae. The common 
features uniting the two families (Table II) are at the heart of a proposal 
that an order be created to contain Coronaviridae and Arteriviridae to 
reflect their common features and, probably, their evolutionary rela- 
tionships. The name Nidovirales, from the Latin nidus, meaning nest, 
has been designated for the order, as all members produce mRNAs in 
an extensive nested-set arrangement. 

The remainder of this review is restricted largely to the coronaviruses.


III. STRUCTURE OF VIRIONS


A. Virion Morphology 


Coronaviruses are enveloped, more or less spherical, approximately 
120 nm in diameter, with a prominent fringe of 20-nm-long, petal- 
shaped surface projections (spikes) composed of a heavily glycosylated 
type I glycoprotein, spike protein (S) (Fig. 1). A subset of the coronavi- 
ruses (Table I) has an additional layer of short spikes (Caul and Eggle- 
stone, 1977; Dea and Tijssen, 1988), which consist of hemagglutinin- 
esterase (HE) protein, also a type I glycoprotein. These small spikes 
are not essential for viral infectivity. Both the large and small spikes 
are anchored in the envelope, which is a lipid bilayer formed by virus 
budding from intracellular membranes. The envelope is associated 
with, in addition to the S and HE proteins, a smaller type III integral 
membrane protein (M), which spans the envelope three times. An even 
smaller protein [envelope (E) or small membrane (sM) protein] has 
recently been shown to be an integral membrane protein of the viral 
envelope. Inside the envelope is a ribonucleoprotein (RNP) core, which 
comprises the RNA genome and a single species of nucleocapsid pro- 
tein N. Electron microscopic observation of viral RNP showed a long 
helix of 14 to 16 nm (Macnaughton et al., 1978; Sturman and
Holmes, 1983). 

A very recent study of intact and detergent-treated TGEV virions 
(Risco et al., 1996) by negative-staining, ultrathin sectioning, freeze- 
fracture, immunogold mapping and cryoelectron microscopy showed a 
surprising new feature of coronavirus particles, namely, a spherical, 
probably icosahedral, core inside the virion (Fig. 3). These internal 
cores comprise not only the N protein and RNA but also the M protein, M 
being the major core shell component. Disruption of the cores released 
helical nucleocapsids. The presence of an icosahedral core in the corona- 
virus virion had heretofore been unsuspected. This core structure was 
also detected with MHV virion (Risco et al., 1996). This surprising new 
finding gives us cause to reconsider our view of coronavirus architec- 
ture. Thus, the precise structure of the core and RNP inside the virion 
is not certain. 

Toroviruses and coronaviruses have a similar morphology and virion 
composition (Fig. 1, Table II) but are distinguishable in a number of
ways (Table II) (Weiss and Horzinek, 1987; Snijder and Horzinek, 1993, 
1995; Koopmans and Horzinek, 1994), necessitating their inclusion in
separate genera. The morphology of the arteriviruses is substantially 
different from that of coronaviruses and toroviruses, particularly in 
having an icosahedral RNP core (Fig. 1) (Snijder and Spaan, 1995); 
hence, a separate family is maintained for arteriviruses. However, the 
recent discovery of the icosahedral core for coronavirus (Risco et al.,
1996) may have blurred this distinction. 


B. Structural Proteins 
1. Spike Glycoprotein (S) 


The S glycoprotein is the outermost component of the virion, and is 
responsible for the attachment of the virus to cells (Collins et al., 1982;
Godet et al., 1994; Kubo et al., 1994) and for instigating the fusion of 
the virus envelope with cell membranes. It is the primary target for 
the host’s immune responses; neutralizing antibodies are induced 
mainly by S (Collins et al., 1982), and immunization in animals with
S alone can induce protection from some coronaviruses (Ignjatovic and 
Galli, 1994; Torres et al., 1995). Within a coronavirus species, sequence 
variation is usually exhibited more by S than by any other structural 
proteins; the variation of the S protein sequence probably confers a 
selective advantage in immune animals. These and other aspects have 
recently been reviewed in detail (Cavanagh et al., 1995). 

The S protein is large, ranging from some 1160 (IBV) to 1452 amino 
acids (FCV). There are many potential N-linked glycosylation sites (21 
to 35), most of which have glycans attached. The S preproprotein has 
an N-terminal signal sequence and a membrane-anchoring sequence
near the C terminus (Fig. 4). The S protein may be cleaved into S1 
and S2 subunits; the extent of its cleavage varies greatly among the 
species (Cavanagh, 1995). A high proportion, up to 100%, of the S 
protein is cleaved in some coronaviruses (IBV, MHV, BCV, TCV, PEDV) 
(Cavanagh, 1983a); none is cleaved in others (TGEV, FCV, CCV) 
(Garwes and Pocock, 1975); and very little of the S protein of HCV- 
229E and HCV-OC43 is cleaved, although the S of OC43 is completely 


Fig. 1. Models of the virions of a coronavirus, a torovirus, and an arterivirus. The
HE protein is present only in antigenic group 2 coronaviruses (see Table I). Reproduced 
with permission from Cavanagh et al. (1994). 


[Fig. 2 diagram: genome maps of coronavirus MHV (31 kb), torovirus BEV (~25 kb), and arterivirus EAV (13 kb), each showing ORF 1a and ORF 1b followed by the numbered genes; shading marks the structural protein genes.]

TABLE II

FEATURES OF CORONAVIRUSES, TOROVIRUSES, AND ARTERIVIRUSES

Feature                                                         Coronavirus   Torovirus     Arterivirus

Enveloped                                                       +             +             +
Linear positive-sense ssRNA with poly(A) tail (a)               +             +             +
Genome organization (a)
  5' polymerase gene-structural protein genes 3'                +             +             +
3' co-terminal nested set of ≥4 subgenomic mRNAs (a)            +             +             +
Leader sequence in mRNAs                                        +             -             +
Only 5' unique region of mRNAs is translationally active (a)    +             +             +
Ribosomal frameshifting in the polymerase gene                  +             +             +
M protein with triple membrane-spanning sequences               +             +             [illegible]
Intracellular budding                                           +             [illegible]   +
Genome size (kb)                                                27-31         ~25           13-15
Nucleocapsid                                                    Helical (c)   Tubular       Isometric
Prominent spikes                                                +             +             -
Coiled-coil structure in spikes                                 +             +             -
Size of virion proteins (kDa)
  Large surface glycoprotein (S or G)                           180-220       200           GL 30-42; GS 25
  Hemagglutinin-esterase protein (HE)                           60-65 (b)     (d)           (e)
  Integral membrane protein (M)                                 25-35         26            18
  Small membrane protein (E)                                    10-12         (e)           [illegible]
  Nucleocapsid protein                                          43-50         18            12

(a) Primary common characteristics for inclusion of these viruses in the proposed order Nidovirales.
(b) Present in only a subset of coronaviruses (Table I).
(c) May have an isometric core in addition (Risco et al., 1996).
(d) HE pseudogene known for BEV.
(e) No such protein described.


Fig. 2. Comparison of the genome organization of a coronavirus (MHV), a torovirus
(Berne virus, BEV), and an arterivirus (equine arteritis virus, EAV). The genes (num- 
bered) are drawn approximately to scale. The various coronaviruses differ with respect 
to the possession of an HE gene (see Table I) and with respect to the number and position 
of nonstructural protein genes. The polymerase genes encode two ORFs, 1a and 1b, which 
overlap. L, leader sequence; HE, hemagglutinin-esterase; S, spike; E, small membrane
protein; M, integral membrane protein; N, nucleocapsid protein; An, poly(A) tail; GS and
GL, small and large glycoproteins, respectively.


10 MICHAEL M. C. LAI AND DAVID CAVANAGH 




Fig. 3. Model of the coronavirus virion based on the data of Risco et al. (1996) for TGEV.
This model illustrates the observation that internal cores (IC), possibly icosahedral, were 
observed inside virions of TGEV. The cores comprise the helical ribonucleoprotein (NC) 
(genome RNA + N protein) and the M protein. Reproduced with permission from Risco 
et al. (1996). 


cleaved if trypsin is present (Hogue and Brian, 1986). The extent of S 
cleavage depends on the cell type (Frana et al., 1985). Cleavage generates
two glycopolypeptides, N-terminal S1 and C-terminal S2, the latter
being acylated (Sturman et al., 1985). S1 is probably linked to the S2 
subunits by noncovalent linkage: trypsin treatment of MHV virions 
caused cleavage of all S proteins without disrupting the spikes (Sturman
et al., 1985); however, S1 can be released from virions by either
urea or mild alkali treatment (Cavanagh and Davis, 1986; Sturman et 
al., 1990; Weismiller et al., 1990). 
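
The count of potential N-linked glycosylation sites quoted above for S (21 to 35) is the kind of figure usually obtained by scanning a protein sequence for the N-X-S/T sequon, where X is any residue except proline. The following is a minimal sketch of such a scan, not the method used in the cited studies; the function name and the toy sequence are hypothetical, and the sequon definition is an assumption brought in from general glycobiology rather than from this review.

```python
import re

# N-X-S/T sequon, where X is any residue except proline (assumed definition).
# A zero-width lookahead is used so that overlapping sequons such as "NNTS"
# are each reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def potential_n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy fragment for illustration; NOT a real S protein sequence.
toy_fragment = "MNGTAANPSLLNNTSKQNAT"
sites = potential_n_glycosylation_sites(toy_fragment)
print(sites)  # [2, 12, 13, 18]
```

A sequon only marks a potential site; whether a glycan is actually attached has to be determined experimentally, which is why the text distinguishes potential sites from those that carry glycans.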

Among the coronavirus genus as a whole, the S2 polypeptide is much 
more conserved than S1. Regions of up to 30% amino acid identity 
(particularly in the transmembrane domain) exist between the S2 poly- 
peptides of coronaviruses in the different antigenic groups, whereas 
Fig. 4. Features of the S protein, based on two MHV-JHM strains, (a) (S. E. Parker
et al., 1989) and (b) (Schmidt et al., 1987). The amino acid numbering has been normalized
with respect to that of the longest known MHV S protein, that of MHV4 (JHM) (S. E. 
Parker et al., 1989). (a) The protein has an amino-terminal signal peptide (sp) and a 
transmembrane (TM) sequence near the C terminus. The glycosylated propolypeptide 
is cleaved at a basic connecting peptide (cp) to yield glycopolypeptides S1 and S2. The 
locations shown are those of three mutations present in mutants of MHV4 recovered 
from a persistently infected neural cell line, the mutants requiring a pH of 5.5-6.0 for 
membrane fusion (Gallagher et al., 1991). (b) S of another MHV-JHM (Schmidt et al.,
1987), which has a 141-amino acid deletion with respect to (a). Bacterial expression 
products containing residues 33-40 and 1264-1276 bound MAb 11F and 10G, respec- 
tively, both of which neutralize virus infectivity and inhibit membrane fusion. The arrow 
indicates the positions of amino acid substitutions in JHM MAb 11F-resistant mutants 
(Grosse and Siddell, 1994). A peptide comprising residues 900-908 bound another MAb 
that neutralized virus and inhibited fusion (Luytjes et al., 1989). 


there is almost no conservation of the S1 sequence. Furthermore, com- 
parison of S1 sequences among strains of a given species, or between 
species of a given group, reveals hypervariable regions, which include 
frequent deletions, mutations, or recombination (Cavanagh et al., 1988; 
S. E. Parker et al., 1989; Banner et al., 1990; Gallagher et al., 1990), 
suggesting that this region is externally exposed and not essential for 
the structure. 

The S2 polypeptide has two regions with a seven-residue periodicity,
forming heptad repeats (Fig. 4) indicative of a coiled-coil structure (de 
Groot et al., 1987). Indeed, current evidence suggests that the mature 
S protein forms an oligomer; for TGEV, it is probably a trimer (Delmas 
and Laude, 1990). However, a dimer structure has been proposed for 
IBV S protein (Cavanagh, 1983c). Therefore, the oligomeric S protein 
is envisaged as being anchored in the membrane by an α-helical region
near to the C terminus of S2. Just beyond the outer membrane surface 
is the shorter (minor) repeat structure predicted to be an α helix of
5-7 nm. The major repeat indicates a helix of 10-13 nm, which may
form the narrow stalk of the spikes (de Groot et al., 1987). All coronavirus
S2 proteins have a highly conserved eight-residue sequence
KWPWW/YVWL, the last five residues of which probably form the be- 
ginning of the membrane-spanning domain. Terminating 10 residues 
upstream of KWP is a leucine-zipper motif, the length varying from 
three to five heptad repeats (Britton, 1991). The highly conserved se- 
quences of S2 may play a role in forming the stalk, which has a more 
rigid structure. In contrast, the S1 domain is predicted to form the 
globular portion of the spikes, consistent with its highly variable 
nature. 
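
Two of the quantitative statements above lend themselves to a short worked sketch: locating the conserved KWPWW/YVWL sequence in an S2 sequence, and relating helix length to residue count. The sketch below assumes an ideal α helix with a rise of 0.15 nm per residue, a textbook value that is not stated in this review; the function names and toy sequence are hypothetical.

```python
import re

# Conserved eight-residue S2 sequence from the text, KWPWW/YVWL:
# the fifth position is either W or Y.
CONSERVED_S2_MOTIF = re.compile(r"KWPW[WY]VWL")

# Assumed rise per residue along an ideal alpha helix, in nm (textbook value,
# not taken from the review).
ALPHA_HELIX_RISE_NM = 0.15


def find_conserved_motif(protein: str) -> list[int]:
    """Return 1-based start positions of the KWPW[W/Y]VWL motif."""
    return [m.start() + 1 for m in CONSERVED_S2_MOTIF.finditer(protein.upper())]


def residues_for_helix_length(length_nm: float) -> int:
    """Approximate number of residues in an ideal alpha helix of this length."""
    return round(length_nm / ALPHA_HELIX_RISE_NM)


# Hypothetical toy sequence for illustration; NOT a real S2 sequence.
print(find_conserved_motif("AAKWPWYVWLGG"))  # [3]

# Residue counts implied by the helix lengths quoted in the text.
for label, low_nm, high_nm in [("minor repeat", 5, 7), ("major repeat", 10, 13)]:
    print(label, residues_for_helix_length(low_nm), residues_for_helix_length(high_nm))
```

Under that assumption, the 5-7 nm minor repeat would correspond to roughly 33-47 residues and the 10-13 nm major repeat to roughly 67-87 residues, that is, on the order of 5-7 and 10-12 heptads, respectively.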

The S protein has two important biological activities for the virus: 

a. Induction of Membrane Fusion. This activity may be required for 
viral entry into cells or for cytopathic effects. Expression of the recombi- 
nant S gene has provided unequivocal evidence that the S protein 
alone is sufficient to cause membrane fusion, as shown by syncytium 
formation (de Groot et al., 1989; Pfleiderer et al., 1990; Yoo et al., 1991; 
Taguchi, 1993). Several regions of the S protein, widely separated in 
a linear sense, have been implicated in the membrane fusion process 
by the following observations: (1) S2 of BCV expressed in insect cells 
caused fusion (Yoo et al., 1991). (2) A monoclonal antibody that inhibited
cell fusion was shown to bind to the S2 domain of MHV (Fig. 4) (Luytjes 
et al., 1989). (3) Changes at three S2 residues (1067, 1094, and 1114 
in the MHV4 S protein; Fig. 4) were associated with a change from a
requirement for a neutral pH to an acidic pH for fusion (Gallagher et
al., 1991). (4) Two bacterial expression products containing residues 
33-40 (S1) and 1264-1276 (S2) of the JHM strain of MHV induced
monoclonal antibodies 11F and 10G, respectively, both of which inhibited
fusion (Fig. 4) (Routledge et al., 1991). (5) Chemical modifications of the
cysteine residues, specifically residue 1163 in the ectodomain of S2, re- 
duced the fusion activity of the JHM strain of MHV (Gallagher, 1996). 
This result also suggests strain-specific differences in the conformation 
of the S protein, since the fusion activity of the A59 strain of MHV was 
not affected by this modification. (6) Some mutations to cysteine residues 
within the transmembrane domain of S adversely affected fusion, sug- 
gesting that the transmembrane domain is involved in conformational 
changes that are associated with fusion activity (Bos et al., 1995). These 
results combined suggest that the S2 ectodomain contains the major de- 
terminants for membrane fusion; however, S2 does not contain hydro- 
phobic domains typical of fusion proteins. Thus, several disparate re- 
gions, including some in the S1, may contribute to the fusion activity, 
probably because some of these regions are juxtaposed in the three-dimensional
structure or can affect the overall conformation of the
spikes. Interestingly, monoclonal antibody-resistant mutants of the 
JHM strain of MHV selected with antibody 11F had mutations not at the 
antibody-binding site (residues 33-40 of S1), but at a distant site, i.e., 
residues 1109-1116 in the S2 domain (Grosse and Siddell, 1994) (Fig. 
4), suggesting that S is folded such that regions which are widely sepa- 
rated in the linear sense are juxtaposed to form functional domains. 

Early studies of coronavirus-induced cell-cell fusion suggested that
only cleaved S was able to promote cell fusion (Sturman et al., 1985). More 
recent studies in which MHV S proteins with mutated S1-S2 connecting 
peptides were expressed have shown that cleavage is not essential for 
fusogenic activity, although cell-cell fusion is more efficient when the 
S protein is cleaved (Stauber et al., 1993; Taguchi, 1993; Bos et al.,
1995). Furthermore, naturally occurring mutants of MHV, derived from 
persistently infected mouse cells, which are defective in S cleavage, 
have delayed fusion activity (Gombold et al., 1993). Expression of the 
feline infectious peritonitis virus (FIPV) S protein, which is not cleaved 
at all, also resulted in syncytia formation (de Groot et al., 1989). These 
results indicate that S protein cleavage is not required for but can 
enhance membrane fusion. Whether membrane fusion activity, as man- 
ifested by syncytia formation, is required for viral infectivity has not 
been established. There are MHV strains (e.g., MHV-2) that do not 
cause syncytia formation in cultured cells; however, these viruses may 
be able to cause virus—cell membrane fusion within the infected cells. 

b. Receptor Binding. Monoclonal antibodies (MAb) against the S 
protein of most coronaviruses can neutralize viral infectivity; thus, it 
is assumed that the S protein mediates virus binding to the receptors 
on target cells. Indeed, the S protein or a portion of it can bind to the 
viral receptor molecules in vitro. This has been demonstrated for MHV 
and TGEV S proteins (Godet et al., 1994; Kubo et al., 1994). The binding 
domain has been mapped to the N-terminal 330 amino acids of MHV 
S1 protein. Site-directed mutagenesis of this region showed that muta- 
tions of the residues at position 62 and positions 212, 214, and 216 
abolished the binding of the protein to the receptor (Suzuki and Ta- 
guchi, 1996), suggesting that the receptor-binding site might comprise 
discontiguous regions in the linear sense. The S2 subunit is not involved 
in receptor binding (Taguchi, 1995). 

The receptor-binding sites of TGEV S protein have been mapped to 
a 223-residue region (aa 506-729) of the S1 (Godet et al., 1994), which 
overlaps with an epitope for a neutralizing MAb. This neutralizing 
MAb was able to block the binding of the 223-residue polypeptide to 
the receptor; conversely, the receptor did not block the binding of the
MAb to this polypeptide, suggesting that the receptor-binding determi- 
nants and the neutralizing epitopes are distinct and are part of a 
domain of S whose configuration is independent of the remainder of 
the S protein. 

S proteins of BCV and HCV-OC43 bind to 9-O-acetylneuraminic acid 
(Schultze et al., 1991a); this binding is required for viral infection. The 
significance of this binding will be discussed in Section V,B on virus 
attachment. Intriguingly, several coronavirus S proteins share some 
sequence identity with the receptor for the Fc fragment of mammalian 
immunoglobulins (Fcγ receptor). Thus, MAb to the Fcγ receptor could
immunoprecipitate S protein from the MHV-infected cells, and S could
bind to the Fc fragment of immunoglobulin. This molecular mimicry
was first demonstrated for MHV and, more recently, for BCV and TGEV 
as well (Oleszak and Leibowitz, 1990; Oleszak et al., 1992, 1995). It may
play a role in modulating viral pathogenicity. This potential function is 
significant because expression of the S protein in the infected cells 
induces not only humoral antibodies but cellular immunity as well 
(Welsh et al., 1986); the potential binding of S to the Fc fragment of 
immunoglobulin may modulate these immune responses. 


2. Integral Membrane Glycoprotein (M)


The M protein is one of only two of the structural proteins [the other 
being the E protein (see below)] that are essential for the production 
of coronavirus-like particles. The sequence of the M protein reveals 
that the M polypeptides comprise 225-230 amino acids, except for some 
members of the TGEV group, which have an additional 30 or so residues 
at the amino terminus, forming a cleavable membrane insertion signal. 
The amino-terminal 20 or so residues of the mature M protein of all 
the coronaviruses are hydrophilic, exposed at the virion surface, and 
have a small number of glycosylation sites. Glycans are of the N-linked 
type for IBV and the TGEV group and O-linked for the MHV group 
(Rottier, 1995). The remainder of the N-terminal half of the molecule 
forms three helical membrane-spanning domains, although a mutant 
M protein which lacked all three of the membrane-spanning domains 
did associate with membranes in vitro (Mayer et al., 1988). The struc- 
ture of the C-terminal half is uncertain, but it is believed to be largely 
situated on the inside of the viral envelope, based on protease suscepti- 
bility (Rottier et al., 1984; Cavanagh et al., 1986b) and sequence-based 
predictions (Armstrong et al., 1984; Rottier et al., 1986). However, some 
M molecules of TGEV virions have the C terminus exposed at the virion 
surface (Laviada et al., 1990; Risco et al., 1995). Moreover, MAb specific 
for the C-terminal 46 amino acids of M neutralized TGEV virions in the
presence of complement and caused antibody-mediated, complement- 
dependent cytolysis of TGEV-infected cells (Risco et al., 1995). Studies 
with mutant MHV M proteins expressed from vaccinia virus recombi- 
nants had shown that some had the N terminus and others the C 
terminus at the luminal side of the endoplasmic reticulum, equivalent 
to the outer surface of virions (Locker et al., 1992b). Some molecules 
of one mutant M protein had both termini at the luminal surface, and 
other molecules had both termini at the cytoplasmic surface (Locker 
et al., 1992b, 1994). Thus, the precise topology and the structural role 
of the M protein are still not certain. Recent studies have shown that 
some M proteins are also associated with the RNP core of TGEV and 
constitute the outer shell of the internal core (Risco et al., 1996). This 
core-associated M can be clearly separated from the viral envelope. 
Therefore, M may play a dual structural role in forming both the enve- 
lope and the internal core of the virion. 

Several properties of the M protein suggest that it is involved in 
virus particle assembly: (1) The M protein binds to the purified nucleo- 
capsid in vitro (Sturman et al., 1980). (2) When the M protein was 
expressed alone, it was localized in the Golgi complex, near the location 
where virus particles bud (Tooze et al., 1984; Tooze and Tooze, 1985). 
However, recent studies showed that the site of M protein retention in 
the Golgi was slightly different from that for viral particle budding 
(Klumperman et al., 1994), suggesting that additional factors are in- 
volved in virus particle assembly. This will be discussed in Section V,H 
on virus assembly. 

The M protein of TGEV has an additional biological activity: induc- 
tion of a-interferon (Charley and Laude, 1988; Laude et al., 1992). 
Thus, it may play a role in viral pathogenesis. Monoclonal antibodies 
against the M protein do not neutralize viral infectivity, suggesting 
that M is not involved in receptor binding. However, some of these 
antibodies can neutralize viral infectivity in the presence of comple- 
ment (Collins et al., 1982; Laviada et al., 1990), indicating that part of 
the M protein is exposed on the virion surface. 


3. Hemagglutinin-Esterase Glycoprotein (HE) 


The HE glycoprotein —or perhaps one should say the HE gene—of 
coronaviruses is something of an enigma. Only coronaviruses belonging 
to the MHV group possess the HE gene (Table I). Even there, not all 
virus strains within a species express the HE protein (Luytjes et al., 
1988; Yokomori et al., 1991). As with many of the so-called nonstructu- 
ral protein genes of coronaviruses, the product of the HE gene is not 
essential for viral replication, certainly not in the cell types used in
the laboratory. The HE protein was first detected in BCV (King et al.,
1985) and some MHV strains; however, acceptance of it as a legitimate 
virus-encoded protein was delayed because in one of the MHV strains 
studied most thoroughly, A59, virions lacked HE. The HE gene of A59 
was later shown to lack the initiation codon of the HE open reading 
frame (ORF); thus, the HE gene is a pseudogene in this (Luytjes et al., 
1988) and several other MHV strains (Yokomori et al., 1991). A com- 
plete, functional HE gene was subsequently identified in the JHM 
strain (Shieh et al., 1989) and several others (Yokomori et al., 1991). 

The HE glycoprotein, of approximately 65 kDa (424 amino acids 
in BCV), has been detected in virions of HEV, MHV, HCV-OC43,
BCV, and TCV. When analyzed under nonreducing conditions, the 
HE protein migrates as a dimer of approximately 140 kDa (King et al., 
1985). The mature protein is believed to exist in the virion as a dimer, 
anchored by the C terminus, forming a fringe of short spikes visualized 
by electron microscopy (Caul and Egglestone, 1977; Dea and Tijssen, 
1988). It is not known whether each spike consists of more than one 
HE dimer. 

Those coronaviruses which contain HE in their virions cause hemag- 
glutination much more efficiently than those that do not. Similar to the 
S protein, HE alone can mediate hemagglutination and hemadsorption 
(King et al., 1985; Hogue and Brian, 1986; Vlasak et al., 1988b; Deregt 
et al., 1989; Pfleiderer et al., 1991; Schultze et al., 1991a); however, 
HE seems to have weaker activity than S (Schultze et al., 1991a). HE 
binds to 9-O-acetylated neuraminic acid (Vlasak et al., 1988b; Schultze
et al., 1991a), which is also a target for S binding. Some HE-specific 
MAb can neutralize BCV infectivity (Deregt and Babiuk, 1987; Deregt 
et al., 1989). Thus, HE protein of BCV may participate in virus binding 
to the receptor. The relative importance of HE and S in hemagglutina- 
tion and tissue tropism of BCV is not known. 

As its name implies, the HE protein also has esterase activity; spe- 
cifically, it is a neuraminate-O-acetylesterase. It hydrolyzes the 9-O- 
acetylated sialic acid on erythrocytes, thereby reversing hemagglutina- 
tion induced by the HE or S protein; thus, HE is considered a receptor-
destroying enzyme (Vlasak et al., 1988a,b; Yokomori et al., 1989; Parker
et al., 1990). The putative esterase active site is FGDS, encoded by 
amino acids 19-22 of the mature HE polypeptide of BCV (M. D. Parker 
et al., 1989; Kienzle et al., 1990). In these respects, it resembles the 
hemagglutinin-esterase-fusion (HEF) glycoprotein of influenza C vi- 
ruses, which also has hemagglutinating and 9-O-acetylated sialic acid- 
hydrolyzing esterase activities. Moreover, the HE protein of coronavi-
ruses shares some 29% amino acid identity with the HEF of influenza
C virus, including conservation of the position of the putative esterase- 
active site FGDS and many cysteine residues (Luytjes et al., 1988; 
S. E. Parker et al., 1989; Kienzle et al., 1990; Zhang et al., 1991). Unlike 
the HEF protein of influenza C virus, which is cleaved into two subunits 
(Nakada et al., 1984), the coronavirus HE protein is not cleaved and 
lacks most of the C-terminal subunit of the HEF of influenza C virus. 

Because of the close relatedness between the coronavirus HE protein 
and the influenza C virus HEF protein, and because the HE gene is 
present in only one coronavirus group, it was proposed that the HE gene 
was acquired by a coronavirus as a result of recombination between 
an ancestral coronavirus and influenza C virus (Luytjes et al., 1988). 
Interestingly, the torovirus Berne virus also has an HE pseudogene 
(gene 4; Fig. 2) (Snijder and Horzinek, 1995), the amino acid sequence 
of which has approximately 30% identity with the C-terminal part of 
the coronavirus HE. 

The functional significance of HE for coronaviruses is not known. 
Among coronaviruses, only BCV requires HE for infectivity; however, 
the presence of HE may affect the pathogenicity of some coronaviruses, 
as evidenced by the findings that passive administration of HE-specific 
MAb in mice altered MHV pathogenicity and that MHVs with an HE 
have different neuropathogenicity from those without HE (Yokomori 
et al., 1992a, 1995). Conceivably, the presence of HE in an MHV may 
allow the virus to utilize an alternative receptor independently of the 
S protein. However, this is not the case, as evidenced by the finding 
that an MAb specific for the murine biliary glycoprotein molecule, which 
is the major MHV receptor recognized by S, inhibited the infectivity of 
an HE-containing MHV (Gagneten et al., 1995). Thus, the HE protein 
does not enable a virus to bypass the primary MHV receptor and may 
provide only an auxiliary function for virus binding to target cells. 


4. Small Membrane Protein (E) 


Until recently it was thought that coronaviruses possessed three (S, 
N, M) or four (including HE) structural proteins. It is now clear that 
coronaviruses, but not toroviruses, possess an additional virion protein, 
the E protein. It plays an essential role in virion assembly. It has 
been shown that the E and M proteins are the only two viral proteins 
absolutely required for virion assembly (Bos et al., 1996; Vennema et
al., 1996). This protein has been demonstrated for IBV (Smith et al., 
1990; Liu et al., 1991), TGEV (Godet et al., 1992) and MHV (Yu et al., 
1994). When the deduced E proteins of the other coronaviruses are 
taken into account, it transpires that the E proteins vary from 84 to 
109 amino acids, corresponding to molecular weights of 9100 to 12,400
(Siddell, 1995c). Siddell has highlighted a number of features common 
to all the E proteins, namely, a hydrophobic region of some two dozen 
residues, starting near the N terminus; a cysteine-rich region immedi- 
ately downstream from this; a conserved proline residue in the middle 
of the molecule, and otherwise very low amino acid identity in the 
genus as a whole; and an abundance of charged residues in the C- 
terminal half of the protein (Siddell, 1995c). 
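
As a rough cross-check on the size range quoted above, chain length can be converted to an approximate mass. The following is a minimal sketch in Python, assuming an average of about 110 Da per amino acid residue; that figure is a common rule of thumb and is not taken from the sources cited here, and the function name is purely illustrative.

```python
# Rough mass estimate from chain length, assuming ~110 Da per residue
# (a rule-of-thumb average, not a value taken from the text).
AVERAGE_RESIDUE_MASS_DA = 110


def estimate_mass_da(n_residues: int) -> int:
    """Return an approximate molecular mass in daltons for a polypeptide."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA


# The shortest and longest E proteins mentioned in the text.
for length in (84, 109):
    print(length, "residues ->", estimate_mass_da(length), "Da (approx.)")
```

Under this assumption the estimates fall within a few percent of the quoted molecular weights; exact values depend on the actual amino acid composition.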

It is now well established that this protein is associated with highly 
purified virion preparations (Liu et al., 1991; Godet et al., 1992; Yu et 
al., 1994). Liu and Inglis calculated the ratio of S:N:M:E proteins in 
virions of IBV-Beaudette strain to be 1:11:10:2, indicating an amount 
of E protein similar to that of S protein (Liu et al., 1991). In contrast, 
Godet et al. estimated that the S:M:E protein ratio in virions of TGEV 
was 20:300:1 (Godet et al., 1992), and Vennema et al. (1996) have 
suggested an M:E ratio of approximately 100:1 for virions of MHV. It
is not clear why these estimates vary so widely.
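
To make the spread among these estimates easier to see, the reported ratios can be reduced to a common M:E basis. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and only the numbers come from the studies cited above.

```python
from fractions import Fraction

# Reported virion protein ratios, as quoted in the text.
# Keys and structure are illustrative; only the numbers come from the text.
reported = {
    "IBV-Beaudette": {"S": 1, "N": 11, "M": 10, "E": 2},
    "TGEV": {"S": 20, "M": 300, "E": 1},
    "MHV": {"M": 100, "E": 1},
}


def m_to_e(ratios: dict) -> Fraction:
    """Return the M:E ratio as an exact fraction."""
    return Fraction(ratios["M"], ratios["E"])


for virus, ratios in reported.items():
    print(f"{virus}: M:E = {float(m_to_e(ratios)):g}:1")
```

On this basis the three estimates of M relative to E differ by up to 60-fold (5:1, 300:1, and 100:1).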

The E protein in the cells is localized in the perinuclear region, with 
some migrating to the cell surface (Godet et al., 1992; Yu et al., 1994). 
Experimental evidence suggests that the E protein is anchored in the
membrane by a sequence in the N-terminal half of the molecule. Thus,
antibodies specific for epitopes in the C-terminal half of the TGEV E 
protein produced cell-surface fluorescence in paraformaldehyde-fixed, 
TGEV-infected cells (Godet et al., 1992), but the precise topology of the 
protein has not been elucidated. The role of the E protein in virion 
assembly will be discussed in Section V,H on virus assembly and re- 
lease. 

The E proteins of IBV and MHV are translated from the third and 
second ORFs, respectively, of mRNAs 3 and 5 of the respective viruses. 
Both of these are polycistronic mRNAs (see Figs. 5 and 7 and Section 
V,G,2). In contrast, in all other viruses, the E protein is derived from 
a monocistronic mRNA. The mechanism of translation of the IBV and 
MHV E proteins is discussed in Section V,G. 


5. Nucleocapsid Protein (N) 


The N protein is a 50- to 60-kDa phosphoprotein which, together 
with the genomic RNA, forms a helical nucleocapsid (RNP). The RNPs
of coronaviruses have been reported variously as being from 9-11 to
14-16 nm in diameter (see Laude and Masters, 1995, for references). 
The N protein in RNP provides only limited protection to the RNA 
genome against ribonucleases. The N proteins vary from 377 to 455 
amino acids in length, are highly basic, and have a high (7-11%) serine
content; the serines are potential targets for phosphorylation. Sequence conservation
within the genus is low. Thus, the N proteins of IBV and TGEV have 
only 29% identity with that of BCV. Even within the MHV group, the 
N proteins of MHV and BCV share only 70% identity, whereas the M 
proteins of these two viruses have 86% identity (Lapps et al., 1987). 
Based on sequence comparison, three structural domains in the N 
protein have been identified (Parker and Masters, 1990). The middle 
domain is an RNA-binding domain (Masters, 1992; Nelson and Stohl-
man, 1993) which binds to both coronaviral and nonviral RNA se-
quences in vitro (Robbins et al., 1986; Stohlman et al., 1988; Masters,
1992); however, it does not contain any motifs characteristic of other 
RNA-binding proteins. Under specific binding conditions, the MHV N 
protein binds to the leader RNA sequence, particularly nucleotides 
56-67 (Stohlman et al., 1988). Furthermore, an anti-N MAb immuno- 
precipitated all of the MHV RNA molecules which had the leader se- 
quence (Baric et al., 1988). The N protein of IBV also bound to the 3’ 
untranslated region of the IBV RNA in vitro (Zhou et al., 1996). These 
RNA-binding properties are consistent with the fact that the N protein 
interacts with the viral genomic RNA to form the nucleocapsid. This inter-
action is necessary for the formation of virus particles, as N alone cannot
be incorporated into virus particles, whereas the N-RNA complex can 
(Bos et al., 1996; Vennema et al., 1996). However, the specificity of the 
RNA-N protein interaction required for nucleocapsid formation has 
not been elucidated. The N protein also binds to membranes and phos- 
pholipid (Anderson and Wong, 1993). This may be another property 
which facilitates the formation of virus particles. 

The finding that the N protein binds to the 5’ and 3’ ends of viral 
RNA suggests that the N protein may also modulate viral RNA synthe- 
sis because the ends of the RNA are likely involved in the regulation 
of RNA synthesis. In an in vitro RNA replication system, the addition 
of MHV N-specific antibodies inhibited viral RNA synthesis (Compton 
et al., 1987), suggesting that the N protein is a component of the RNA- 
synthesizing machinery. The ability of N to bind to the membrane 
(Anderson and Wong, 1993) may enable the formation of the RNA 
replication or transcription complex, in view of the fact that viral RNA 
synthesis occurs in the membrane fraction of infected cells (Brayton et 
al., 1982; Dennis and Brian, 1982). 

The three structural domains of the N protein are separated by 
spacer regions, which are not conserved (Masters, 1992). The functions 
of the N- and C-terminal conserved domains are not yet clear. Using 
a targeted recombination approach (Koetzner et al., 1992; Masters et
al., 1994) to generate recombinant viruses that have a chimeric N gene
containing parts of BCV and MHV sequences, Peng et al. (1995a) have
shown that there is strict sequence specificity within the conserved 
structural domains for viable recombinants. Since the N protein consti- 
tutes the nucleocapsid, mutations within the N protein will likely affect 
the stability or viability of the virus. Indeed, several temperature- 
sensitive and thermolabile mutants of MHV have deletions or muta- 
tions within the N protein (Koetzner et al., 1992; Peng et al., 1995b). 
Viruses with site-specific mutations of the N gene have been generated 
by targeted recombination techniques; interestingly, revertants of 
these mutants often have second-site mutations located at different 
domains, suggesting that there are interactions between different do- 
mains of the N protein (Peng et al., 1995b). 

The role of phosphorylation in the N protein has not been elucidated. 


C. RNA Genome 


The coronavirus contains a positive-sense, single-stranded RNA ge- 
nome, which is the largest viral RNA genome known, ranging from 
27.6 to 31 kb. The large size of the viral RNA requires the virus to 
develop special mechanisms of RNA synthesis to counter the deleterious 
effects of the possible errors during RNA synthesis. The virion RNA 
functions as an mRNA and is infectious. It contains approximately 
7-10 functional genes, 4 or 5 of which encode structural proteins. The 
genes are arranged in the order 5'-polymerase-(HE)-S-E-M-N-3', with 
a variable number of other, mostly nonstructural and largely nonessen- 
tial, genes interspersed among them (Fig. 5). This gene arrangement 
also applies to toroviruses and arteriviruses (Fig. 2). The 5’ terminus 
of the coronavirus genome is capped, and the RNA starts with a leader 
sequence of 65-98 nucleotides, which is also present at the 5’ end of 
mRNAs, followed by a 200- to 400-nucleotide untranslated region 
(UTR). At the other end of the genome is a 3’ UTR of 200-500 nucleo- 
tides followed by a poly(A) tail. Almost two-thirds of the entire RNA 
is occupied by the polymerase gene, which comprises two overlapping 
ORFs, 1a and 1b. At the overlap region is a specific seven-nucleotide
“slippery” sequence and a pseudoknot structure, characteristic of the
ribosomal frameshifting signal (Brierley et al., 1987, 1989; Lee et al.,
1991; Herold and Siddell, 1993), which is required for the translation 
of ORF 1b. The architecture of the nonstructural protein genes inter- 
spersed between the known structural protein genes varies signifi- 
cantly among different coronavirus species (Fig. 5). For example, in 
HCV-229E, gene 3 contains two ORFs, whereas in the related virus 
PEDV these two ORFs are fused (Duarte et al., 1994). In HCV-OC43,
gene 4 is missing altogether (Mounir and Talbot, 1993). Finally, in 
IBV, two ORFs are inserted between M and N genes. The variability 
of gene structure indicates the plasticity of coronavirus RNA and the 
frequent occurrence of recombination and also suggests that there is 
no strong conservation pressure on these nonstructural proteins. There 
is a stretch of consensus sequence, UCUAAAC (for MHV), or a related
sequence, at sites immediately upstream of most of the genes. These
sequences represent signals for transcription of subgenomic mRNAs
(see Section V,E).

[Figure 5 diagram not reproduced. Rows are labeled MHV, IBV, TGEV,
HCV-229E (PEDV), FCV, and BCV (HCV-OC43); recoverable gene labels
include 1a, 1b, HE, S, M, and N.]

Fig. 5. Comparative genome structure of the different coronaviruses. The complete
sequences are available for MHV, IBV, TGEV, and HCV-229E. The gene 1 sequences of
the remaining viruses have not been completed. Gene 1 sequences are interrupted and
shortened to highlight the remaining genes. The vertical lines represent mRNA start
sites; thus, each region between two vertical lines represents a separate gene (“transcrip-
tion unit”). The structural protein genes are marked by various symbols, and nonstructu-
ral protein genes are represented by unfilled boxes. The gene arrangements of ns protein
genes and E protein gene are very heterogeneous in terms of transcription unit and the
relative size and position among different strains of the same virus species; only the
representative one is presented. The numbering system for the genes of HCV-229E
deviates from the published one (Herold et al., 1993) to be consistent with the other
viruses. HCV-OC43 does not have a gene 4.
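
Locating the consensus sequence UCUAAAC in a genome is a simple motif search. The following is a minimal sketch, assuming an exact-match search only; the function name and the toy sequence are hypothetical, and related (non-identical) signal sequences would need a more permissive search.

```python
def find_motif(rna: str, motif: str = "UCUAAAC") -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    rna, motif = rna.upper(), motif.upper()
    hits = []
    start = rna.find(motif)
    while start != -1:
        hits.append(start)
        # Advance by one so that overlapping occurrences are also found.
        start = rna.find(motif, start + 1)
    return hits


# Hypothetical toy sequence, not a real coronavirus genome.
toy_genome = "GGAUCUAAACAUGCCCUUUAAUCUAAACGGAUGAAA"
print(find_motif(toy_genome))
```

In this toy sequence the motif occurs twice, each time a short distance upstream of an AUG.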

Finally, a pseudoknot structure has been shown to be present at the 
3’ end of the coronaviral RNA (Williams et al., 1995). A characteristic 
feature of the Coronaviridae, and of the Arteriviridae as well, is that 
all known member species generate a 3’-coterminal nested set of five or 
more mRNAs (see Fig. 7). Each coronavirus and arterivirus subgenomic 
mRNA has the leader sequence at its 5’ end. Curiously, no leader RNA 
sequence is present in the torovirus RNAs (Fig. 2). 
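
The idea of a 3'-coterminal nested set can be sketched in a few lines. The example below assumes the simplified gene order given above and ignores the interspersed nonstructural genes; the RNA numbering it prints is hypothetical and does not follow the published mRNA nomenclature.

```python
# Gene order as given in the text (5' to 3'); HE is present only in some viruses.
GENE_ORDER = ["polymerase", "HE", "S", "E", "M", "N"]


def nested_set(genes: list[str]) -> list[list[str]]:
    """Return the gene content of each 3'-coterminal transcript.

    Transcript i starts at gene i and runs to the 3' end, so every
    transcript shares the same 3' terminus. The first entry corresponds
    to genome-length RNA.
    """
    return [genes[i:] for i in range(len(genes))]


for number, content in enumerate(nested_set(GENE_ORDER), start=1):
    # Each coronavirus mRNA also carries the leader sequence at its 5' end.
    print(f"RNA {number}: leader + " + "-".join(content))
```

Every transcript in the output ends with N, which is what "3'-coterminal" means; each successive transcript is shorter by one gene at its 5' end.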


IV. NONSTRUCTURAL PROTEINS 


In 1990, the Coronavirus Study Group published its recommenda- 
tions for the nomenclature of coronavirus genes, mRNAs, and proteins 
(Cavanagh et al., 1990). At that time it was reluctant to apply the term 
“nonstructural” to the potential products of genes which were suspected 
of not being structural proteins. This caution was a consequence of our 
lack of knowledge of those gene products, a situation which has im- 
proved greatly in the last 5 years or so. This has resulted in the term 
“nonstructural (ns)” being applied more widely to several gene products. 
Every gene that encodes the ns proteins has been deleted in at least 
some naturally occurring virus isolates; thus, most of the ns genes are 
not essential for viral replication. However, some of the ns proteins 
may play a role in viral tissue tropism or pathogenicity. 


A. The Polymerase 


The polymerase is encoded by gene 1, which accounts for approxi- 
mately two-thirds of the genome (Fig. 2). The complete polymerase 
gene of four coronaviruses (IBV, MHV, HCV-229E, and TGEV) covering 
each of the three coronavirus groups has been sequenced (Boursnell et 
al., 1987; Lee et al., 1991; Herold et al., 1993; Bonilla et al., 1994; 
Eleouet et al., 1995). Although the polymerase genes vary in size from 
approximately 18 to 22 kb, the encoded proteins have many structural 
features in common. The degree of amino acid identity for this gene 
product is greater than is observed for any other coronavirus gene 
product. 

The polymerase gene is predicted to encode a protein of approxi- 
mately 740-800 kDa. Proteins of this size have not been detected in 
coronavirus-infected cells, in part because of co-translational polyprotein
processing. The pol gene encodes two ORFs, 1a and 1b, which
overlap by a few dozen nucleotides (Figs. 2 and 6). The second, ORF 
1b, is in the -1 reading frame with respect to the upstream ORF 1a
and is translated following ribosomal frameshifting in the overlap re- 
gion. This will be examined in more detail in Section V,G. 
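
The effect of a -1 frameshift on the reading frame can be illustrated with a toy example. This is a minimal sketch under stated assumptions: the sequence and the slip position are hypothetical, and neither the slippery sequence nor the pseudoknot is modeled; the code only shows how stepping back one nucleotide changes the codons read downstream.

```python
def codons_with_minus1_shift(rna: str, slip_at: int) -> list[str]:
    """Split rna into codons, slipping back one nucleotide at index slip_at.

    Codons are read in frame 0 up to (not including) position slip_at,
    which must be a multiple of 3. Reading then resumes at slip_at - 1,
    i.e. in the -1 frame, so one nucleotide is read twice.
    """
    if slip_at % 3 != 0 or not 3 <= slip_at <= len(rna):
        raise ValueError("slip_at must be a codon boundary inside the sequence")
    upstream = [rna[i:i + 3] for i in range(0, slip_at, 3)]
    resume = slip_at - 1
    # Stop before any incomplete trailing codon.
    downstream = [rna[i:i + 3] for i in range(resume, len(rna) - 2, 3)]
    return upstream + downstream


# Hypothetical toy sequence, not taken from any coronavirus genome.
toy = "AUGGCUUUUAAACGGGCC"
print(codons_with_minus1_shift(toy, slip_at=12))
```

Without the shift, the codons after AAA would be CGG and GCC; with it, the final A of AAA is read again and the downstream codons become ACG and GGC.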

There is greater amino acid identity among the 1b than the 1a ORFs. 
For example, 1a and 1b of IBV, the least typical coronavirus in terms of 
protein sequences, have amino acid identity/similarity of approximately 
30/50% and 55/79%, respectively, compared with those of MHV, HCV, 
and TGEV. It is the 1a ORF which accounts for the MHV polymerase 
gene being approximately 1-2 kb longer than those of IBV, HCV, 
and TGEV. 
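
Identity and similarity percentages of the kind quoted above are computed from a pairwise alignment. The following is a minimal sketch, assuming a pre-aligned pair of sequences and a hypothetical grouping of similar residues; published values depend on the alignment method and scoring matrix used, which are not specified here, and conventions for the denominator also differ.

```python
# Hypothetical grouping of chemically similar residues, for illustration only;
# published similarity scores depend on the scoring matrix actually used.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG"]


def identity_and_similarity(a: str, b: str) -> tuple[float, float]:
    """Return (% identity, % similarity) over gap-free aligned columns.

    a and b are rows of a pairwise alignment of equal length, with '-'
    marking gaps. Identical residues also count as similar.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if "-" not in (x, y)]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    identical = sum(x == y for x, y in columns)
    similar = sum(
        x == y or any(x in g and y in g for g in SIMILAR_GROUPS)
        for x, y in columns
    )
    n = len(columns)
    return 100 * identical / n, 100 * similar / n


# Toy alignment rows, not real polymerase sequences.
print(identity_and_similarity("MKV-LSDE", "MRIALT-E"))
```

Similarity is always at least as high as identity, which is why each pair of figures in the text (for example 30/50%) has the larger number second.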

[Figure 6 diagram not reproduced. Recoverable labels, in order of
appearance: ORFs, ORF1b; PLP1, X, PLP2, MD, 3CLP, MD, GFL; A autoprotease,
C (-) strand synthesis, D mRNA synthesis.]

Fig. 6. Features of the coronavirus polymerase gene, based on that of MHV (Lee et
al., 1991). (a) The polymerase gene comprises two ORFs, 1a and 1b, which overlap, the
1b ORF being translated after ribosomal frameshifting. (b) The positions of motifs: PLP
1 and 2, papain-like protease; X domain, highly conserved between IBV and MHV; 3CLP,
picornavirus-3C-like protease; MD, membrane-associated domain; GFL, growth factor-
like; POL, RNA-dependent RNA polymerase; MB, metal-binding motif; HEL, helicase.
(c) Genetic complementation groups (Schaad et al., 1990; Fu and Baric, 1994).
(d) Processing scheme for part of the 1a ORF (Denison et al., 1992, 1995).

A number of functional domains within pol have been predicted
following computer-based motif analyses (Boursnell et al., 1987; Hodg-
man, 1988; Gorbalenya et al., 1989a,b; Lee et al., 1991); some of these
functional domains have been confirmed by experimental analysis. The
location of these motifs is illustrated in Fig. 6. Three motifs have been
identified in ORF 1a, indicating the presence of one or two papain-like
cysteine proteases (PLP), a chymotrypsin/picornaviral 3C-like protease
(3CLP) and a cysteine-rich growth factor-related protein (GFL). MHV,
HCV-229E, and TGEV have two PLP domains (1 and 2), with PLP2 
corresponding to the single PLP domain of IBV. Sequence correspond-
ing to a cysteine protease of Streptococcus pneumoniae has been identi-
fied in 1a of IBV. Upstream of PLP2 is a region termed the X domain,
a region of particularly high conservation between IBV and MHV 
and similar to one near the thiol protease of alpha- and rubiviruses 
(Gorbalenya et al., 1991). There is no functional evidence so far to link 
the GFL with known growth factors, but the predictions of most of the 
protease domains have been confirmed by experimental analysis. The 
first PLP domain of MHV is responsible for the cleavage of p28/p30 
and p65 from the N terminus of the MHV ORF 1a polyprotein (Fig. 6) 
(Baker et al., 1989, 1993; Bonilla et al., 1995, 1997). This PLP was 
inhibited by zinc chloride but not by leupeptin (Baker et al., 1989; 
Denison et al., 1992). Deletion analysis defined this proteinase domain 
to be within the sequence encoded by the 3.6- to 4.4-kb region from the
5' end of the genome. Site-directed mutagenesis showed that residues 
Cys-1137 and His-1288 were essential for protease activity (Baker et 
al., 1993). Some amino acid sequences between the p28 cleavage site 
and the PLP domain were also essential for the cis cleavage that gener- 
ates p28 (Baker et al., 1993; Bonilla et al., 1995). The function of PLP2 
has not been demonstrated. 

The 3CLP domain extends for approximately 300 amino acids and 
is homologous to proteases encoded by picornaviruses and several other 
virus genera. The putative 3CLP domain of HCV-229E has been ex-
pressed as a β-galactosidase fusion protein in Escherichia coli and
shown to have autocatalytic proteolytic activity, releasing an active 
3CLP protein (Ziebuhr et al., 1995). An antiserum against this fusion 
protein immunoprecipitated a 34-kDa protein from HCV-229E-infected 
cells. Similar activity has been demonstrated for the 3CLPs of MHV 
(Lu et al., 1995) and IBV (Tibbles et al., 1996). This protease cleaves 
not only its own boundaries but also several downstream sites within 
ORF 1a and ORF 1b, probably both in cis and in trans. Computer 
analysis predicted that the catalytic center of the IBV 3CLP would 
include Cys-2922, His-2820, and Glu-2843 (Gorbalenya et al., 1989a,b). 
Site-directed mutagenesis confirmed the role of the Cys and His resi- 
dues but showed that the Glu residue was not essential (Liu and Brown, 
1995). The same approach confirmed that the predicted QS(G) dipeptide 
bonds in the 1b ORF are the targets for the protease activity of the 
3CLP of IBV (discussed further in Section V,G). Similar conclusions
were reached for 3CLP activity of MHV and HCV-229E (Lu et al., 1995; 
Grotzinger et al., 1996). The importance of Cys-3495 in the 3CLP of 
MHV has been demonstrated (Seybert et al., 1997). In vitro transcrip-
tion and translation of a cDNA containing the putative 3CLP of MHV
produced polypeptides of 38 and 33 kDa, which were subsequently 
processed to products of 32 and 27 kDa (Lu et al., 1995). The 27-kDa 
protein possesses the 3C-like protease activity (Lu et al., 1996). The 
3CLP domain is flanked by predicted membrane-spanning domains, 
which may be important for the proteolytic activity (Tibbles et al., 1996) 
(Fig. 6). 
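
The QS(G) dipeptide rule mentioned above lends itself to a simple scan for candidate cleavage sites. The sketch below is illustrative only: it assumes that cleavage occurs after every Q that is followed by S or G, ignores any further sequence context that the protease may require, and uses a hypothetical toy sequence and function name.

```python
import re


def predicted_cleavage_products(polyprotein: str) -> list[str]:
    """Split a protein sequence after Q wherever Q is followed by S or G.

    This only encodes the Q-S / Q-G dipeptide rule; it ignores the
    additional sequence context a real protease would require.
    """
    # Zero-width split point: preceded by Q, followed by S or G.
    return re.split(r"(?<=Q)(?=[SG])", polyprotein.upper())


# Hypothetical toy sequence, not a real ORF 1b product.
print(predicted_cleavage_products("MAALQSKDTVLQGFFPQA"))
```

A scan of this kind will overpredict sites in a real polyprotein, which is one reason the predicted sites had to be confirmed by mutagenesis.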

Poor expression of the IBV 3CLP protein in vitro led to the discovery 
that this protease was ubiquitinated and subsequently degraded by an
adenosine triphosphate (ATP)-dependent protease present in reticulo- 
cyte lysate (Tibbles et al., 1995). This is the third example of a viral 
protein subject to turnover in this manner and involves a different 
virus class from the previously reported examples, in a picornavirus 
(Oberst et al., 1993) and an alphavirus (de Groot et al., 1991). The 
ubiquitin-mediated, ATP-dependent proteolytic pathway is a major cel- 
lular, nonlysosomal, protein degradation system, which may cause 
rapid turnover of the coronaviral polymerase. 

The functional domains associated with RNA synthesis are located 
within the more conserved 1b ORF. These include domains for an RNA- 
dependent RNA polymerase, a nucleoside triphosphate (NTP)-binding/ 
helicase domain, and a zinc-finger nucleic acid-binding domain (metal 
binding domain) (Fig. 6). Computer analyses identified the polymerase 
domain (Boursnell et al., 1987; Hodgman, 1988; Gorbalenya et al., 
1989a,b). Unlike the GDD motif present in many viruses, the corre- 
sponding sequence in coronaviruses is SDD. Whether the polymerase 
gene products contain activities other than proteases and polymerases 
is not known. 


B. Other Nonstructural (ns) Proteins 


The coronaviruses exhibit great heterogeneity with respect to the 
number and genome location of ns protein genes and in regard to the 
number of ORFs within a gene (Fig. 5). The functions of these ns 
proteins are still unknown. 


1. Genes between the Polymerase and S Gene (Gene 2 of MHV, BCV,
and HCV-OC43)


There are two genes located between the polymerase and S genes 
of these viruses (Fig. 5). Gene 2-1 encodes the HE protein, while gene 
2 encodes an ns protein of unknown function. The gene 2 protein com-
prises approximately 260 amino acids (30 kDa) (Luytjes et al., 1988;
Shieh et al., 1989; Labonte et al., 1995). The BCV and MHV homologs 
share 45% amino acid identity, while the homolog of HCV-OC43 has 
92% identity with that of BCV. This gene product has been detected 
in the cytoplasm of MHV-, BCV-, and HCV-OC43-infected cells and
may be phosphorylated (Bredenbeek et al., 1990; Zoltick et al., 1990; 
Cox et al., 1991; Labonte et al., 1995). Computer analysis of its sequence 
suggested the presence of a nucleotide-binding site (Luytjes et al., 1988). 
However, no function has been assigned to this protein, and it is not 
required for virus replication in culture (Schwarz et al., 1990). Interest- 
ingly, the C terminus of the torovirus ORF 1a product (polymerase) 
has 31-36% sequence identity with the gene 2 product of MHV (Snijder 
et al., 1991). This evolutionary relationship between coronavirus and 
torovirus suggests that the gene 2 product is probably involved in 
viral RNA synthesis, since it is expressed as part of the torovirus 
polymerases. 


2. Genes between S and E (Genes 3 and 3-1 of IBV, TGEV, HCV-
229E, and FCV and Gene 4 of MHV and BCV)


There are two to three ORFs in this region, and their structure and 
the mechanism of expression of gene products vary markedly among 
different coronavirus species. They can be expressed as two different 
genes, i.e., expressed from two separate mRNAs (e.g., mRNAs 4 and 5 
of MHV and BCV and mRNAs 3 and 3-1 of TGEV) or localized in one 
gene, thus requiring internal initiation of translation from a single 
polycistronic mRNA (e.g., mRNA 3 of the IBV and FCV groups). In
IBV, gene 3 contains three ORFs (3a, 3b, and 3c); ORF 3c encodes the E
protein, which is a viral structural protein, while 3a and 3b encode ns 
proteins. The gene products of both ORFs 3a and 3b (approximately 
7 kDa) have been detected in small quantities in virus-infected cells 
(Liu et al., 1991). In TGEV, this region contains two ORFs, being sepa- 
rated from the E protein gene. These two ORFs are encoded by mRNAs 
3 and 3-1, respectively, the predicted protein products being approxi- 
mately 8 and 27 kDa, respectively. In a related nonenterogenic strain, 
PRCV, however, there are multiple deletions in this region, essentially 
inactivating one or both of the ORFs (Rasschaert et al., 1990; Wesley 
et al., 1991). It has been suggested that the absence of the 3a product, 
in addition to a shorter S protein, might be associated with their lack 
of enteropathogenicity. However, Vaughn et al. (1995) have recently
described two PRCV strains which have an intact 3a gene.

Canine coronavirus has gene 3 ORFs equivalent to those of TGEV,
exhibiting high amino acid identity (>80%), although the second ORF 
is truncated by a stop codon (Horsburgh et al., 1992). Two other mem- 
bers of the TGEV group exhibit a variation on the same theme. PEDV 
and HCV-229E lack a homolog of ORF 3a of TGEV and CCV. PEDV has
an ORF corresponding to 3b of TGEV, while HCV-229E has two ORFs 
corresponding to the single ORF of PEDV (Duarte et al., 1994). 

Members of the group II coronaviruses also exhibit great heterogene-
ity in this region. MHV-JHM produces mRNA 4, which encodes a
15-kDa protein. This protein has been detected in virus-infected cells 
(Ebner et al., 1988). In contrast, HCV-OC43 contains only 11 amino 
acids in this region (Mounir and Talbot, 1993). Gene 5 of MHV has 
two ORFs, 5a and 5b. The latter encodes the structural E protein and 
is the predominant product made from mRNA 5 (Leibowitz et al., 1988). 
It is not clear whether ORF 5a is translated at all. At least one strain 
of MHV lacks the 5a ORF (Yokomori and Lai, 1991); also, HCV-OC43 
has the 5a ORF but is unable to produce a corresponding mRNA 
(Mounir and Talbot, 1993). 

In summary, there is great heterogeneity with respect to the number, 
size, and mechanism of expression of ORFs between the S and E genes. 
These ns proteins probably are not required for viral replication. The 
lack of necessary function may account for the heterogeneity which 
arose during evolution. 


3. Gene 5 (between M and N Genes) of IBV 


IBV is unique in that it has two ORFs (5a and 5b), which encode 
proteins of 7.4 and 9.5 kDa, respectively. These proteins have been 
detected in very small amounts in virus-infected cells (Liu and Inglis, 
1992a). The function of these ORFs is not clear. 


4. ORFs in the Very 3' End 


TGEV has an additional gene 7, which encodes a 9.1-kDa protein
(Garwes et al., 1989; Tung et al., 1992), in the region corresponding to
the 3’ end untranslated region of other viruses (Fig. 5). This protein
is hydrophobic and is associated with the endoplasmic reticulum and
cell surface membranes (Tung et al., 1992), but its nuclear localization
has also been reported (Garwes et al., 1989). FCVs and CCV have two
ORFs in the same region, the first being analogous to the single ORF
of TGEV. The second (7b) ORF encodes a 14-kDa soluble protein con-
taining the sequence KTEL (Vennema et al., 1992), which is similar
to the endoplasmic reticulum retention signal, KDEL. The protein is
partially retained in the endoplasmic reticulum but is also slowly se-
creted out of the cells. The functions of these proteins are not known.


V. REPLICATION CYCLE 


A. Viral Host Ranges and Metabolic Requirements of 
Viral Replication 


Coronaviruses have relatively restricted host ranges, infecting only 
their natural hosts and closely related animal species. Occasionally, 
cross-species infection of coronaviruses occurs, such as the experimen-
tal infection of monkeys by MHV, which causes central nervous system
demyelination (Murray et al., 1992; Cabirac et al., 1994), and the occa- 
sional infection of humans by BCV, which causes diarrhea. BCV also 
infects turkeys and TGEV infects dogs, suggesting some flexibility in 
their host range. The expansion of viral host range can be achieved by 
passing the coronavirus in a heterologous cell line, as demonstrated 
by the emergence of an MHV variant with the ability to infect originally 
nonpermissive cell lines, such as human cells, after serial passages 
(Baric et al., 1997). In animals, coronaviruses have restricted tissue 
tropism; for example, most HCV strains cause only respiratory infec- 
tions. Different strains of a coronavirus may have distinct tissue speci- 
ficity; for example, TGEV infects both the gastrointestinal tract, caus- 
ing fatal diarrhea, and respiratory tract tissues without causing 
primary respiratory symptoms, whereas PRCV, which is closely related 
to TGEV, infects the respiratory tract of pigs but replicates poorly in 
the intestinal tract (Cox et al., 1990). The species and tissue specificity 
of a coronavirus infection is at least partially dictated by the nature 
and distribution of cellular receptors and other related molecules that 
regulate virus entry, as evidenced by the viral replication that results 
when viral RNA is directly introduced into cell types of other animal 
species. Thus, coronaviruses have the potential to replicate in many 
cell types. 

The complete coronavirus replication cycle takes place in the cyto- 
plasm. It has been shown that MHV can replicate in enucleated cells 
and in the presence of actinomycin D and a-amanitin, suggesting that 
nuclear functions are not required for viral replication (Brayton et 
al., 1981; Wilhelmsen et al., 1981). There are, however, reports of the
inhibition of replication by actinomycin D of some coronaviruses, includ- 
ing feline enteric coronavirus (Lewis et al., 1993), IBV (Evans and 
Simpson, 1980), HCV-229E (Kennedy and Johnson-Lussenberg, 1978), 
and MHV in some cell lines (Dupuy and Lamontagne, 1987). Thus, 
nuclear functions may be required for viral replication under certain
conditions. This issue has not been resolved. 


B. Virus Attachment 
1. Virus Binding to Erythrocytes 


The first step in viral infection is the binding of the virus to target 
cells. Hemagglutination and hemadsorption have been used as assays 
for studying virus—cell interaction, although the erythrocyte itself is not 
a target cell for coronavirus infection. Several coronaviruses, including 
HEV, IBV, BCV, and some strains of MHV and HCV, can cause hemag- 
glutination (Sugiyama and Amano, 1980; Schultze et al., 1990; Zhang 
et al., 1994a). The binding residue on the cell surface is a 9-O-acetylated 
neuraminic acid of glycoproteins or glycolipids (Schultze et al., 1990), 
although different coronaviruses may prefer different structural iso- 
forms of 9-O-acetylated neuraminic acid. For BCV, the virus binding 
to erythrocytes is mediated through either the S or HE protein, both 
of which have hemagglutinating activities, the S protein having the 
stronger activity (King et al., 1985; Schultze et al., 1991a,b). The HE 
protein of BCV and HEV also recognizes 9-O-acetylated neuraminic 
acid, and its esterase activity is also specific for this molecule; thus, 
HE protein has both receptor-binding and receptor-destroying activities 
(Vlasak et al., 1988a,b; Schultze et al., 1991b). Expression of the HE 
protein of MHV on the cell surface conferred a hemadsorption activity 
(Pfleiderer et al., 1991); however, even viruses that lack HE protein 
(e.g., IBV) can cause hemagglutination, suggesting a role for the S protein
in hemagglutination. Thus, the HE and S proteins of various coronavi- 
ruses may have comparable functions, enabling the virus to bind the 
sialic acid residues; however, only the HE protein confers the receptor- 
destroying activity. The residue necessary for hemagglutination by 
IBV is α2,3-linked N-acetylneuraminic acid (Schultze et al., 1992).
Curiously, the hemagglutinating activity of IBV is not evident until 
the virus particle is treated with neuraminidase, suggesting that the 
S protein itself is covered by sialic acid. Although virus binding to 
erythrocytes provides a good model system for studying virus—cell inter- 
actions, it may not necessarily reflect the actual mechanism of virus 
attachment to target cells. 


2. Virus Binding to Target Cells 


The classical study of virus attachment to target cells involved the 
in vitro binding of MHV to macrophages from genetically susceptible 
and resistant mouse strains (Shif and Bang, 1970). This study showed
that MHV bound equally well to cells from resistant and susceptible 
mice, even though macrophages from resistant mice were resistant 
to virus infection. Similar observations have been made on splenic 
lymphocytes, thymocytes (Krzystyniak and Dupuy, 1981), and glial 
cells (Wilson and Dales, 1988); thus, it appears that genetic resistance 
is not exerted at the level of virus binding in these cases. Similarly, 
established tissue culture cell lines, including murine and primate 
cells, irrespective of their degree of susceptibility or resistance to MHV, 
bound MHV to the same extent (van Dinter and Flintoff, 1987; Kooi 
et al., 1988, 1991). Thus, virus may bind to a ubiquitous molecule on 
the cell surface, which, however, may not lead to virus infection. 
Whether this ubiquitous molecule is a sialic acid-containing glycopro- 
tein has not been established. 

The binding of BCV to its target cells, such as MDCK cells, is medi- 
ated by 9-O-acetylneuraminic acid residues similar to those on erythro- 
cytes. Removal of the sialic acid by neuraminidase abolished virus 
attachment, while resialylation restored it (Schultze and Herrler,
1992). HCV-OC43 binds to a similar sialic acid residue but prefers a 
form slightly different from that for BCV (Kunkel and Herrler, 1993). 
The HE protein of BCV can also mediate virus binding to target cells, 
and this binding may be required for viral infection, as suggested by 
the finding that MAb against HE inhibited BCV infectivity (Deregt 
and Babiuk, 1987; Deregt et al., 1989). One inhibitor of the esterase 
activity of HE protein, diisopropylfluorophosphate, also inhibited BCV 
infection (Vlasak et al., 1988a). The S protein of BCV probably also 
participates in virus binding to target cells, as suggested by the finding 
that the MAb against S protein can neutralize BCV infectivity (Deregt 
et al., 1989). The relative importance of S and HE proteins is not clear. 
In contrast, none of the MAb against the HE protein of MHV inhibited 
MHV infection (Yokomori et al., 1992a). Despite the finding that the 
binding of HE and S proteins to target cells is necessary for BCV 
infection, the binding of BCV or HCV-OC43 to N-acetylneuraminic acid 
in itself is not likely the basis of viral cell tropism because sialic acid 
is a common cell surface carbohydrate residue; thus, an additional, 
more cell type-specific molecule is probably required for viral infection. 


3. Specific Virus Receptors 


The finding that MHV and other coronaviruses bound to resistant 
as well as susceptible cells indicates that this binding may represent 
an initial step in the virus attachment process, which is not sufficient 
for viral infection. It is likely that a more specific binding between 
virus and cells is required for the establishment of viral infection. This
binding involves a specific virus receptor molecule on the cell surface. 

a. MHV Receptor. The MHV receptor was the first coronavirus re- 
ceptor to be identified. It is the murine homolog of a member of the 
carcinoembryonic antigen (CEA) family (Dveksler et al., 1991; Williams 
et al., 1991) and belongs to the biliary glycoprotein (bgp) subfamily. 
The terminology of MHV receptors in the literature is somewhat contro- 
versial, the following terms being used interchangeably: mmCGM1, 
MHVR-1, and BgpA. It has an immunoglobulin-like structure, consisting
of four immunoglobulin-like loops, the N-terminal loop being the
virus-binding domain (Dveksler et al., 1993b). The sequence of the C 
terminus (cytoplasmic domain) of the receptor is not essential. Glycosyl- 
ation of the protein also is not necessary for its receptor function in 
vivo (Dveksler et al., 1995). The functional significance of the receptor 
in viral infection in vivo was demonstrated by the finding that an MAb 
against the MHV receptor inhibited viral infection in mice (Smith et 
al., 1991). 

Subsequently, several additional members of the CEA family were found
to serve as MHV receptors, including an mmCGM2-like protein (also 
termed MHVR-2 and BgpB), which is the product of an alternatively 
spliced form of mmCGM1 RNA and is expressed in both the liver and 
brain, in contrast to the liver-specific expression of mmCGM1 (Yoko- 
mori and Lai, 1992a; Dveksler et al., 1993a); an allelic gene product of 
the bgp gene in SJL mice, a mouse strain resistant to MHV infection 
(Yokomori and Lai, 1992b; Dveksler et al., 1993a); Bgp-2, which is the 
product of a new member of the murine Bgp gene (Nedellec et al., 1994); 
and a novel pregnancy-specific glycoprotein (psg)-like protein, which 
is expressed in the mouse brain, in contrast to placenta-specific expression
of other psg molecules (Chen et al., 1995). All these molecules
contain a consensus motif in the virus-binding domain (N-terminal 
loop). Thus, several different CEA family members, which are differen- 
tially expressed in different cells and tissues, can potentially serve as 
an MHV receptor. Different strains of MHV may use different CEA- 
related molecules as receptors at different efficiencies (Compton, 1994; 
Chen et al., 1995). The prototype MHV receptors (MHVR-1) are ex- 
pressed in the liver, gastrointestinal tract, B cells, macrophages, and 
endothelial cells but not in T cells (Coutelier et al., 1994; Godfraind et 
al., 1995), consistent with the target cell specificity of MHV. However, 
the MHV receptor is also expressed in other tissues, e.g., kidney, which 
are not targets for MHV infection. Also, SJL mice express a functional 
MHV receptor (Yokomori and Lai, 1992b; Dveksler et al., 1993a) but
are resistant to MHV infection (Knobler et al., 1984). Thus, receptor 
expression is not sufficient for viral infection. It is not yet clear which
molecules are used by MHV as receptors in cross-species infection (e.g., 
rats and monkeys) (Murray et al., 1992; Cabirac et al., 1994). Recently 
it was shown that bgp and CEA molecules of human origin could serve 
as receptors for some MHV strains (Chen et al., 1997). 

The expression of the receptor molecules on the cell surface is neces- 
sary for virus infection, and the expression level of the receptor may 
determine the relative susceptibility or resistance to viral infection in 
some cells. During persistent viral infection of cultured murine cells, 
the expression level of the receptor is often reduced, resulting in the 
relative resistance of the cells to viral superinfection, which could be 
overcome by the expression of an exogenous receptor (Sawicki et al., 
1995; Chen and Baric, 1996). Thus, there is a rough correlation between 
receptor expression and the susceptibility of a cell type to virus infec- 
tion. Under certain circumstances, virus may infect cells by a receptor- 
independent mechanism; for example, MHV-infected murine cells may 
fuse with human cells, which do not have MHV receptors, and cause 
the latter cells to become infected (Gallagher et al., 1992). It has been 
shown that MHV infects polarized epithelial cells through the apical, 
but not the basolateral, surface (Rossen et al., 1995a, 1997). It is not 
clear whether the virus receptor is differentially expressed on the two 
different surfaces. 

b. Receptors for TGEV and HCV-229E. The receptors for TGEV and 
HCV-229E have been identified as aminopeptidase N (APN) of the 
porcine and human species, respectively (Delmas et al., 1992; Yeager 
et al., 1992). PRCV also uses porcine APN as a receptor; thus, virus 
binding to the receptor is not sufficient to explain the differences in 
tissue tropism between TGEV and PRCV. APN is a member of the 
membrane-bound metallopeptidase family and is widely distributed on 
diverse cell types; it is highly expressed on the brush border membrane 
of enterocytes. Some of the antibodies against human APN can block 
HCV-229E binding (Yeager et al., 1992); however, the catalytic site of 
the protease activity of APN is not required for receptor function, and 
the inhibitors of APN do not block viral infection (Delmas et al., 1994). 
Similar to MHV, TGEV infects polarized cells through the apical, but 
not the basolateral, surface (Rossen et al., 1994). Again, it is not clear 
whether this is restricted by the differential expression of APN on the 
different sides of the cells. 

TGEV has also been shown to bind to a 200-kDa protein on the 
surface of the enterocytes on the villi of the small intestine (Weingartl 
and Derbyshire, 1994). PRCV does not bind to this molecule. Both the
temporal expression (mainly in the newborn) and spatial distribution 
patterns (on the villi of the gastrointestinal tract) of the 200-kDa protein
correspond to the pattern of susceptibility of piglets to TGEV infection. 
Thus, the expression pattern of this molecule appears to have a better 
correlation than the porcine APN with the tissue tropism of TGEV. 
This 200-kDa molecule may be an alternative receptor used by TGEV. 
The relative functional significance of this molecule and APN as TGEV 
receptors is not yet clear. 

The FIPV strains of FCV and canine coronavirus apparently utilize 
the APN of feline and canine species, respectively, as receptors 
(Benbacer et al., 1997). Cross-species utilization of feline APN by coro- 
naviruses of different species (canine, feline, and human) has also been 
reported (Tresnan et al., 1996). FIPV, however, is unique among
coronaviruses in that it causes an antibody-dependent enhancement (ADE)
phenomenon (Weiss and Scott, 1981), which is the result of the binding 
of the virus—antibody complex to Fc receptors on the surface of macro- 
phages, leading to enhanced virus uptake and spread. This ADE 
phenomenon has been attributed to the S protein—antibody complex 
(Vennema et al., 1990b; Corapi et al., 1992; Olsen et al., 1992). The Fc 
receptor may be a co-factor or an alternative receptor for FIPV entry 
into macrophages. In this regard, the S protein of MHV has been shown 
to have limited sequence homology with the murine Fc receptor and 
to have the ability to bind to the Fc fragment of immunoglobulin 
(Oleszak and Leibowitz, 1990; Oleszak et al., 1992). Whether the Fc 
receptor plays a role in MHV infection is not clear. However, MHV 
does not exhibit ADE. 

c. Receptors for Other Coronaviruses. Sialic acid (N-acetyl-9-O-acetylneuraminic
acid)-containing glycoproteins are probably a component
of the cell surface molecules required for BCV and HCV-OC43
infection because the removal of sialic acids inhibits BCV infection and 
resialylation restores virus infectivity (Schultze and Herrler, 1992); 
however, it is unlikely that it is the primary receptor molecule used 
by these viruses since the distribution of these molecules is more wide- 
spread than the susceptible target cells. The identity of the specific 
receptor for these viruses has not been determined. For HCV-OC43, it 
has been shown that the virus binds to a major histocompatibility 
complex class I molecule (Collins, 1994). However, the receptor function 
of this molecule has not been established. 


C. Penetration and Uncoating 


The mechanism of coronavirus entry into target cells has been contro- 
versial. Early electron microscopic studies visualized virus (MHV and 
IBV) particles inside lysosome-like vesicles near plasma membranes,
suggesting that virus enters cells by endocytosis (“viropexis”) (David- 
Ferreira and Manaker, 1965); however, other studies suggested that 
virus enters cells by direct fusion between virions and the plasma 
membrane (Doughri et al., 1976). Lysosomotropic drugs, such as ammonium
chloride and chloroquine, inhibited MHV-3 virus entry (Krzystyniak
and Dupuy, 1984). Also, MHV-specific antibodies did not lyse virus-infected
cells during the virus-entry process, as would be the case if
the virus fused with the cell membrane (Krzystyniak and Dupuy, 1984). 
These results suggested that MHV-3 enters cells by an endocytotic 
pathway. Similar studies using the A59 strain of MHV, however, 
showed that ammonium chloride delayed, but did not inhibit, the viral 
infection of L-2 cells (Mizzen et al., 1985). In this case, ammonium
chloride was interpreted as inhibiting virus uncoating rather than entry.
Recent studies by the same group have further shown that only a small 
proportion of adsorbed virus enters cells by the endocytotic pathway 
since ammonium chloride, chloroquine, and dansylcadaverine, all of 
which inhibit receptor-mediated endocytosis, did not have significant 
effects on MHV entry (Kooi et al., 1991). The majority of MHV particles
enter cells by virus—cell fusion at the plasma membrane. This interpre- 
tation is consistent with the finding that the optimum pH for MHV- 
induced cell fusion is 7.4 (Weismiller et al., 1990; Kooi et al., 1991), 
rather than the acidic pH expected for a virus that enters cells by an 
endocytotic pathway (e.g., vesicular stomatitis virus). The optimum pH 
for BCV- and IBV-induced cell fusion is also neutral (Payne and Storz, 
1988; Li and Cavanagh, 1992). These findings suggest that coronavirus 
enters cells by virus—cell fusion at the plasma membrane. On the other 
hand, virus internalization by endocytosis may be a nonproductive 
mechanism which does not depend on virus-receptor interaction, since 
some MHV-resistant cell lines can internalize MHV particles as effi- 
ciently as susceptible cell lines (Kooi et al., 1991). Most surprisingly, 
even Vero cells, an African monkey kidney cell line which presumably 
does not have an MHV receptor, can internalize virus (Kooi et al., 1991). 
Therefore, it is likely that MHV enters cells by both acidic-pH- 
dependent (endocytosis) and -nondependent pathways (Kooi et al., 
1991). The exact mechanism of virus entry may depend on cell types 
and virus strains. Interestingly, an MHV variant which has mutations 
in the S protein has an acidic optimum pH of 5.5-6.0, in contrast to 
the pH of 7.5 for the parental virus (Gallagher et al., 1991). This virus
variant probably enters cells by an endocytic pathway, a fact supported 
by the finding that infection of this variant virus is inhibited by ammo- 
nium chloride or chloroquine. 

What triggers virus internalization after virus-receptor binding is
not clear. It has been shown that a conformational change in the S 
protein could be induced at pH 8.0 and incubation at 37°C (Sturman et 
al., 1990). Whether this represents the expected conformational change 
following virus—receptor binding is not clear. Irrespective of the mecha- 
nism of virus internalization, fusion between the viral envelope and 
cell membrane must occur, either at the cell surface or in the endosome, 
for viral infection to take place. Virus-induced cell—cell fusion has been 
used to investigate the ability of a virus to induce fusion. Early studies 
with MHV indicated that virus-induced fusion from without (caused 
by virions at the cell surface) or fusion from within (caused by de novo 
synthesized S protein on the cell surface) required cleavage of the S 
protein (Frana et al., 1985; Sturman et al., 1985). Work on BCV sup- 
ported this view (Payne and Storz, 1988; Storz et al., 1991). However, 
more recent experiments involving the expression of S protein (de Groot 
et al., 1989; Stauber et al., 1993; Taguchi, 1993) and studies of MHV 
fusion mutants (Gombold et al., 1993) have indicated that uncleaved 
S can cause syncytium formation, though less efficiently than the 
cleaved S. Of course, coronaviruses such as TGEV, which have no 
cleaved S protein, are infectious, in fact, highly so. Since fusion of the 
virion envelope with a cell membrane is an essential part of the infection 
process, these results suggest that TGEV must be able to cause virus- 
cell fusion. Thus, virus-cell fusion and cell-cell fusion may have differ- 
ent requirements, and, for at least some coronaviruses, S cleavage is not 
required for the fusion of a virion with a cell membrane. Nevertheless, 
cleaved S may be more efficient at inducing fusion for some coronavi- 
ruses. The concentration of S at the surface of a virion may be higher 
than at the cell surface, such that even the uncleaved S can induce 
virion—cell fusion, even though it cannot cause cell-cell fusion. Virus— 
receptor interaction may also trigger a signal transduction pathway to 
facilitate the internalization of the virus—receptor complex. One study 
showed that tyrosine kinase is activated in macrophages immediately 
following MHV-3 infection (Dackiw et al., 1995). It is not yet known 
whether this is required for virus entry. 

The mechanism of virus uncoating, i.e., the release of virion RNA 
from the nucleocapsid, after the virus has been internalized remains 
unclear. One study suggested that virus uncoating may involve an 
endosomal neutral phosphatase, which preferentially dephosphory- 
lates the nucleocapsid protein (Mohandas and Dales, 1991). Further- 
more, while immature oligodendrocytes were sensitive to JHM virus 
infection, differentiated oligodendrocytes were resistant, probably due 
to a block in virion uncoating (Beushausen et al., 1987). The factors
responsible for the differences in these two types of cells may involve
protein kinases (Wilson et al., 1990). Additional cellular factors may 
be required for viral penetration and uncoating. Various murine cell 
lines, all of which express virus receptors, show different degrees of 
susceptibility to infection by different MHV strains (Kooi et al., 1988; 
Asanaka and Lai, 1993; Yokomori et al., 1993). Cell—cell and virus—cell 
fusion studies indicated that virus infection is blocked at different 
stages of virus entry, including penetration and uncoating, in different 
cell lines (van Dinter and Flintoff, 1987; Asanaka and Lai, 1993). These 
cell lines could be grouped into at least three complementation groups 
with respect to the virus entry process (Flintoff, 1984; Asanaka and 
Lai, 1993). Thus, virus penetration and uncoating appear to require 
separate cellular factors. It has been suggested from the studies using 
recombinant viruses between the A59 and JHM strains of MHV that 
viral S protein may interact with these cellular factors (Yokomori et 
al., 1993). The nature of these factors is not yet clear. 


D. Primary Translation 


Following virus uncoating, the first macromolecular synthetic event 
is predicted to be the synthesis of an RNA-dependent RNA polymer- 
ase(s) from the incoming viral genomic RNA, as is the case for all 
positive-strand RNA viruses. The polymerase is translated from gene 
1 at the 5’ end of the genomic RNA, most likely directly from the 
incoming genomic RNA. The process of primary translation has not 
been observed experimentally. However, inhibitors of protein synthesis 
applied early in the infection blocked RNA transcription (Mahy et al., 
1983; Perlman et al., 1986; Sawicki and Sawicki, 1986), indicating that
protein synthesis, most likely the translation of a viral polymerase, is 
necessary for viral RNA synthesis. This virus-specific polymerase is 
responsible for the synthesis of negative-strand RNA from the incoming 
genomic RNA and subsequent transcription of mRNAs from the 
negative-strand template. The nature of the polymerase is discussed in
Section IV,A. 

Since the genomic-sized RNA is used for both packaging into virus 
particles to become virion RNA and as an mRNA for protein translation, 
the distinction between RNA transcription and RNA replication is often 
blurred. In this review, we will use the term “transcription” to describe 
the synthesis of subgenomic mRNAs as well as genomic RNA used for 
translation; the term “replication” will be used to describe the synthesis 
of the genomic RNA destined to be packaged into virions. 


E. Transcription of Viral mRNAs


Coronavirus RNA synthesis occurs via an RNA-dependent RNA tran- 
scription process; thus, RNA synthesis can occur in the presence of 
actinomycin D (with the exception of some coronaviruses, as discussed 
in Section V,A). The majority of the virus-specific RNAs in the cells 
are mRNAs, which are transcribed from a negative-strand RNA tem- 
plate. For clarity of discussion, the structure of the mRNAs will be 
discussed first. 


1. The Structure of mRNAs 


Coronavirus mRNAs consist of six to eight species of different sizes, 
depending on the coronavirus species and strains (Lai, 1990). The 
largest mRNA is equivalent to the genomic RNA, and the remainder 
are subgenomic in size. These RNAs are designated mRNAs 1 through 
7, in order of decreasing size, according to the recommendations of 
the Coronavirus Study Group in 1989 (Cavanagh et al., 1990). Some
mRNAs have been given a hyphenated name, e.g., mRNA 2-1, because 
they were discovered after the original set of mRNAs was named. They 
have a nested-set structure, and all of them contain sequences starting 
at the 3’ terminus and extending to various distances toward the 5’ 
end (Stern and Kennedy, 1980b; Lai et al., 1981; Leibowitz et al., 1981). 
The smallest mRNA contains only the 3’ terminal ORF, while each 
next larger mRNA contains one additional ORF. The structure of the 
mRNAs in relation to the genome structure is shown in Fig. 7. Thus, 
except for the smallest mRNA, all of the mRNAs are structurally
polycistronic. In general, each ORF in the genome is represented by an
mRNA, whose sequence starts from a consensus signal upstream of
the ORF, and only the 5’-most ORF of each mRNA can be translated;
thus, each mRNA is functionally monocistronic. However, there are
exceptions: some mRNAs, e.g., mRNA 5 of MHV and mRNA 3 of IBV,
are translated into two or three proteins by different mechanisms
(see Section V,G).

Fig. 7. The strategy of transcription and translation of coronavirus (MHV) RNA. The
structural relationship between mRNAs and genomic RNA is shown. The arrows indicate
the translated portion of each mRNA. Each arrow represents one protein product.
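
The nested-set arrangement described above can be illustrated with a short sketch. It is only a schematic: the ORF names are hypothetical placeholders, the set is assumed to contain exactly seven mRNAs with one ORF added per step, and the exceptions noted in the text (mRNAs that yield more than one protein, and the hyphenated mRNAs) are deliberately ignored.

```python
# Hypothetical ORF labels in genome order, 5' -> 3'. These are placeholders
# for illustration, not the gene names of any particular coronavirus.
GENE_ORDER = ["ORF1", "ORF2", "ORF3", "ORF4", "ORF5", "ORF6", "ORF7"]


def nested_set(genes):
    """Map each mRNA number to the ORFs it carries.

    mRNA 1 is genome-sized; every mRNA extends to the 3' terminus, and each
    next-smaller mRNA lacks one more ORF at the 5' side.
    """
    return {i + 1: genes[i:] for i in range(len(genes))}


def translated_orf(orfs):
    """General rule from the text: only the 5'-most ORF is translated."""
    return orfs[0]


mrnas = nested_set(GENE_ORDER)
for number, orfs in mrnas.items():
    structure = "polycistronic" if len(orfs) > 1 else "monocistronic"
    print(f"mRNA {number}: carries {len(orfs)} ORF(s), structurally {structure}, "
          f"translates {translated_orf(orfs)}")
```

In this sketch every mRNA except the smallest carries more than one ORF, yet each contributes a single product, which is the sense in which the mRNAs are structurally polycistronic but functionally monocistronic.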

Several additional minor mRNA species have been detected, some 
of which could only be detected by reverse transcription-polymerase 
chain reaction (La Monica et al., 1992; Schaad and Baric, 1993). These 
minor RNAs probably represent RNA transcripts from weak or atypical 
mRNA start signals (see below). Most do not contain a complete ORF 
at the 5’ end; thus, they are probably not functional. Furthermore, in 
MHV, several mRNAs, e.g., mRNAs 2-1, 2-2, and 3-1, are transcribed
only in some virus strains (Shieh et al., 1989; La Monica et al., 1992). 
The syntheses of these mRNAs appear to be differentially regulated 
by the sequence at the 5’ end of the viral genome (Shieh et al., 1989;
La Monica et al., 1992). 

Coronavirus mRNAs have another unique structural feature: their 
5’ ends have a leader sequence of approximately 60-90 nucleotides, 
which is derived from the 5’ end of the genomic RNA (Lai et al., 1982, 
1983, 1984; Spaan et al., 1983). The leader sequences of all the mRNAs
are identical for a given strain of virus, except for slight variations at 
some of the leader-mRNA fusion sites, and are identical to the sequence 
present at the 5’ end of the genomic RNA. At the mRNA start sites on 
the viral genomic RNA, there is a short stretch of sequence that is 
nearly homologous to the 3’ end of the leader RNA (Budzilowicz et al.,
1985). This sequence constitutes part of the signal for subgenomic 
mRNA transcription (Makino et al., 1991). Sequence comparison of viral 
genomic and mRNAs suggests that subgenomic mRNAs are derived by 
fusion of the 5’ end genomic RNA sequence (leader) to the mRNA start 
sites on the viral genomic RNA. The mRNA start sites are usually 
located between the genes; hence, they are termed intergenic (IG) se- 
quences. However, some of the IGs may overlap the coding region of 
the upstream gene. The core sequence of the IG for MHV is UCUAAAC 
or a slightly variant form of this sequence at various IG sites (Joo and 
Makino, 1992). Other virus species also have similar IG sequences. 
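
As a rough illustration of how candidate IG sites might be located, the sketch below scans an RNA string for the MHV core sequence UCUAAAC. The one-mismatch tolerance is an assumption standing in for "a slightly variant form", and the test sequence is invented for the example; neither comes from the studies cited above.

```python
CORE = "UCUAAAC"  # MHV IG core sequence as given in the text


def find_ig_sites(rna, core=CORE, max_mismatch=1):
    """Return (position, matched window, mismatch count) for each window of
    the RNA that differs from the core at no more than max_mismatch bases.

    Positions are 0-based offsets from the 5' end of the input string.
    """
    hits = []
    for start in range(len(rna) - len(core) + 1):
        window = rna[start:start + len(core)]
        mismatches = sum(a != b for a, b in zip(window, core))
        if mismatches <= max_mismatch:
            hits.append((start, window, mismatches))
    return hits


# Invented toy sequence: one exact core and one single-base variant.
toy_rna = "GGAUCUAAACGGCCAAUCUAAUCGG"
sites = find_ig_sites(toy_rna)
for start, window, mismatches in sites:
    print(start, window, mismatches)
```

A sequence match of this kind identifies only candidates; whether a given site actually functions as an mRNA start signal is an experimental question.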

The leader sequence of MHV ranges in length from 72 to 82 nucleo- 
tides, the variation resulting from the heterogeneity of the 3’ end se- 
quence, which contains two to four copies of a pentanucleotide (UCUAA) 
repeat. The homologous nucleotides (UCUAA) at the 3’ end of the leader 
and IG sites serve as fusion sites for the leader and mRNAs. Some of 
the MHV mRNAs are heterogeneous, consisting of several subspecies,
each containing different copy numbers of the UCUAA repeat (Makino 
et al., 1988c). This fact suggests that the fusion between the leader 
RNA and the mRNAs is not very precise. 
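
The stated leader lengths follow directly from the repeat copy number. The sketch below works through that arithmetic, assuming that the whole 72 to 82 nucleotide range is accounted for by whole copies of the pentanucleotide, which is what the text implies; the function and constant names are illustrative.

```python
REPEAT = "UCUAA"                 # pentanucleotide repeat at the leader 3' end
MIN_COPIES, MAX_COPIES = 2, 4    # copy-number range given in the text
MIN_LEADER_LENGTH = 72           # nucleotides, for a leader with MIN_COPIES copies


def leader_length(copies):
    """Leader length in nucleotides for a given UCUAA copy number."""
    if not MIN_COPIES <= copies <= MAX_COPIES:
        raise ValueError("copy number outside the range described in the text")
    return MIN_LEADER_LENGTH + len(REPEAT) * (copies - MIN_COPIES)


lengths = {n: leader_length(n) for n in range(MIN_COPIES, MAX_COPIES + 1)}
print(lengths)  # {2: 72, 3: 77, 4: 82}
```

Each additional copy adds five nucleotides, so two, three, and four copies give leaders of 72, 77, and 82 nucleotides, matching the range quoted above.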

The length and sequence of the leader RNA in other coronaviruses 
vary. However, the 3’ end of the leader sequence always contains a 
pentanucleotide UCUAA or a closely related sequence. mRNAs of coro- 
naviruses other than MHV are usually homogeneous in their structure, 
probably a reflection of the fact that leader RNA at the 5’ end of the 
genome and IG sites in these viruses contain only a single copy of the 
UCUAA-like sequence (Hofmann et al., 1993a). The copy number
of this pentanucleotide repeat apparently plays an important role in 
the regulation of mRNA transcription. 


2. The Structure of Negative-Strand RNA 


Coronavirus RNA synthesis is mediated by RNA-dependent RNA 
synthesis via a negative-strand RNA intermediate (complementary to 
the genomic RNA). Coronavirus negative-strand RNA represents no 
more than 1-2% of the total intracellular virus-specific RNA (Perlman 
et al., 1986; Sawicki and Sawicki, 1986). Both genome-sized and subgen- 
omic negative-strand RNAs, which correspond in number of species and 
size to those of the virus-specific mRNAs, have been detected (Sethna et 
al., 1989; Hofmann et al., 1990). The relative molar ratios of the various 
subgenomic negative-strand RNA species are comparable to those of 
the positive-strand subgenomic mRNAs. The 5’ end of the negative- 
strand RNA contains poly(U) sequences, which are shorter than the 
poly(A) sequences present on the positive-strand RNAs (Hofmann and 
Brian, 1991). At the 3’ end of the negative-strand RNA is the comple- 
mentary sequence of the leader RNA (anti-leader) (Sethna et al., 1991). 
Structurally speaking, the subgenomic negative-strand RNAs appear 
to be mirror images of the positive-strand subgenomic mRNAs. All of 
the negative-strand RNAs in the infected cells are present in the form 
of double-stranded RNA; no free negative-strand RNA is detected (Perl- 
man et al., 1986). 
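
The "mirror image" relationship can be made concrete by taking the reverse complement of a positive-strand RNA. The sketch below uses invented toy sequences (the leader, body, and poly(A) length are all hypothetical) and simplifies one point: it produces a poly(U) tract as long as the poly(A) tail, whereas the text notes that the poly(U) is in fact shorter.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def negative_strand(positive):
    """Reverse complement of an RNA sequence written 5' -> 3'."""
    return positive.translate(COMPLEMENT)[::-1]


# Toy positive-strand subgenomic mRNA: leader + body + poly(A).
leader = "GGAUUCGAUCUAA"   # hypothetical leader ending in UCUAA
body = "ACAUGGCCGGAUAG"    # hypothetical body sequence
poly_a = "A" * 10          # arbitrary tail length

mrna = leader + body + poly_a
negative = negative_strand(mrna)
anti_leader = negative_strand(leader)

print("5' end of negative strand:", negative[:len(poly_a)])
print("3' end of negative strand:", negative[-len(anti_leader):])
```

Because the reverse complement of a concatenation is the concatenation of the reverse complements in reverse order, the poly(U) tract lands at the 5’ end of the negative strand and the anti-leader at its 3’ end, as described above.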


3. Kinetics of Viral RNA Synthesis 


In virus-infected cells, virus-specific mRNA synthesis can usually 
be detected a few hours after infection and throughout most of the viral 
replication cycle (Stern and Kennedy, 1980a; Leibowitz et al., 1981; 
Keck et al., 1988a). The molar amounts of the different mRNA species 
vary; smaller mRNAs are generally more abundant than larger ones, 
but this rule does not always hold true. Nevertheless, the relative ratio 
of different subgenomic mRNA species remains constant throughout,
suggesting that the synthesis of the various subgenomic mRNA species 
is regulated coordinately. Some viruses may show slight variations 
in the amounts of individual mRNA species present during infection 
(Hiscox et al., 1995a). Later in infection, there appears to be an en- 
hanced synthesis of the genomic-sized RNA (Keck et al., 1988a). 

The kinetics of negative-strand RNA synthesis follows a pattern 
similar to that of positive-strand mRNA synthesis; however, the peak 
of negative-strand RNA synthesis appears to occur earlier than for 
positive-strand RNA (Perlman et al., 1986; Sawicki and Sawicki, 1986). 
Thereafter, negative-strand RNA synthesis drops significantly, in con- 
trast to that of positive-strand RNA synthesis, and negative-strand 
RNA appears to be stable (Perlman et al., 1986; Sawicki and Sawicki, 
1986). A similar pattern of kinetics of negative-strand RNA synthesis 
is also seen in the accumulation of the negative-strand RNA of a DI 
RNA, which very rapidly reaches a steady-state level after transfection 
(Lin et al., 1994). Therefore, the negative-strand RNA probably func- 
tions as a template for multiple rounds of positive-strand RNA synthe- 
sis. This conclusion is supported by the study of a ts mutant defective 
in negative-strand RNA synthesis (Schaad and Baric, 1994). However, 
the ability to synthesize negative-strand RNA seems to be maintained 
throughout the viral life cycle, as evidenced by the finding that a trans- 
fected DI RNA can replicate even when transfected late in the infection 
(Jeong and Makino, 1992). 


4. Mechanism of mRNA Synthesis 


Since all subgenomic RNAs consist of a leader RNA derived from 
the 5’ end of the genome and a body sequence derived from various 
downstream sequences, they must be synthesized by fusion of two 
discontiguous sequences either during or after transcription. An early 
study showed that the leader sequence of each mRNA can be exchanged 
freely between two coinfecting viruses, suggesting that the leader RNA 
and mRNAs are transcribed independently and can conjoin in a random 
fashion (Makino et al., 1986b). More recent studies using DI RNA 
constructs that contain an inserted mRNA start signal (see below) 
established that the leader RNA and mRNAs are usually derived from 
two separate RNA molecules (Jeong and Makino, 1994; Zhang et al., 
1994b). These studies unequivocally showed that coronaviral mRNA 
synthesis is carried out by either a discontinuous transcription or trans- 
splicing process, which fuses sequences from two different RNA mole- 
cules. Several transcription models have been proposed, each of which 
is consistent with some of the experimental data. These models are 
not mutually exclusive, as components of each model may operate at 
different stages of the viral replication cycle. Before presenting these 
models, we will discuss several findings pertinent to coronaviral RNA 
transcription. 


1. Coronavirus replication takes place entirely in the cytoplasm. Nu- 
clear functions are believed not to be required for RNA synthesis 
(Brayton et al., 1981; Wilhelmsen et al., 1981); thus, viral RNA 
transcription does not involve the conventional RNA splicing ma- 
chinery present in the nucleus. 

2. Early ultraviolet (UV) transcriptional mapping studies indicated 
that in the late stage of viral replication, the UV target size of each 
subgenomic and genomic mRNA is approximately equivalent to the 
physical size of the respective mRNA (Jacobs et al., 1981; Stern 
and Sefton, 1982a); thus, each mRNA is transcribed independently 
rather than derived by the processing of a large precursor RNA. 
However, early in infection, the UV target sizes of the subgenomic 
mRNAs were found to be equivalent to that of the genomic RNA 
(Yokomori et al., 1992b); thus, at least early in infection, the synthe- 
sis of a genomic-length RNA is required for subgenomic mRNA syn- 
thesis, although it is not clear whether this requirement is for a 
positive- or a negative-stranded, full-length RNA. A more recent 
analysis of the UV target sizes of subgenomic mRNAs of MHV sug- 
gested that, even late in the infection, the UV target sizes of some 
subgenomic mRNAs are slightly larger than their physical lengths 
but smaller than genomic size (den Boon et al., 1995). Similar obser- 
vations were made for equine arteritis virus (an arterivirus). This 
recent result is consistent with either of two interpretations: (a) the 
subgenomic mRNAs are derived from a slightly longer RNA template 
or (b) they are derived from a mixture of templates of different sizes 
(genomic as well as subgenomic). The difference in UV target size 
between the early and late stages of viral RNA replication suggests 
that different mechanisms of RNA synthesis may operate at the 
different stages of the viral replication cycle. 

3. The molar ratios of different subgenomic mRNA species and those 
of subgenomic negative-strand RNAs are similar (Sethna et al., 1989; 
Hofmann et al., 1990), suggesting that subgenomic mRNAs and 
subgenomic negative-strand RNAs are derived from each other or 
under the same transcriptional regulation. 

4. The leader RNA at the 5’ end of each mRNA is identical in each 
mRNA and to the leader RNA at the 5’ end of genomic RNA. Further- 
more, there is sequence homology between the 3’ end of the leader 
RNA and the mRNA start sites on the genomic RNA (Budzilowicz 
et al., 1985), where the leader sequence is fused to the mRNAs. 
There is some sequence divergence between the leader RNA and 
some of the mRNA start sites; in these cases, the leader RNA of the 
resulting mRNAs usually mimics the sequence of the mRNA start 
site rather than the leader at the 5’ end of the genome. This finding 
was used to suggest the possible presence of RNA proofreading activ- 
ity during coronavirus transcription (Lai, 1986, 1990; van der Most 
et al., 1994). 
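
The reasoning behind the UV transcriptional mapping in item 2 can be sketched numerically. The sketch assumes single-hit inactivation kinetics, in which the surviving fraction of synthesis falls as exp(-k * L * dose) for a template of length L, so that the ratio of inactivation slopes against a reference of known length gives the target size. This kinetic form, the function names, and all numbers are illustrative assumptions and are not values from the cited studies.

```python
import math


def inactivation_slope(doses, surviving_fractions):
    """Least-squares slope through the origin of ln(fraction) versus UV dose."""
    numerator = sum(d * math.log(f) for d, f in zip(doses, surviving_fractions))
    denominator = sum(d * d for d in doses)
    return numerator / denominator


def target_size(doses, fractions, ref_fractions, ref_length_nt):
    """Estimate UV target size (nt) from the ratio of inactivation slopes,
    using a reference RNA whose template length is known."""
    ratio = inactivation_slope(doses, fractions) / inactivation_slope(
        doses, ref_fractions
    )
    return ref_length_nt * ratio


# Hypothetical data generated from the assumed kinetics (not measured values).
K = 1e-5  # assumed inactivation constant per nucleotide per dose unit
doses = [1.0, 2.0, 3.0, 4.0]
genomic_len = 30000  # reference template of known length
ref_fractions = [math.exp(-K * genomic_len * d) for d in doses]
sub_fractions = [math.exp(-K * 3000 * d) for d in doses]
estimate = target_size(doses, sub_fractions, ref_fractions, genomic_len)
```

Under these assumptions, a target size close to the physical length of an mRNA points to a template of subgenomic length, whereas a target size close to the reference length points to a genomic-length template.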


The following transcriptional models (Fig. 8) address the possible 
mechanism of fusion between the leader sequence and mRNAs. Most 
of the experimental evidence came from MHV studies. The exceptions 
will be noted. 

a. Leader-Primed Transcription. This model proposes that the vi- 
rion genomic RNA is first transcribed into a genomic-length, negative- 
strand RNA, which, in turn, becomes the template for subsequent sub- 
genomic mRNA synthesis. The leader RNA is transcribed from the 3’ 
end of the negative-strand RNA and dissociated from the template. 
The free RNA subsequently associates with the template RNA at vari- 
ous mRNA start sites and serves as a primer for transcription of 
mRNAs. It is proposed that the discontinuous transcription step takes 
place during positive-strand RNA synthesis. Several pieces of evidence 
are compatible with this model: 


1. Several leader RNAs of approximately 50-90 nucleotides have been 
detected in the cytoplasm of MHV-infected cells (Baric et al., 1985). 
Some of these are dissociated from the template RNA and, thus, 
may serve as a potential source of primers in this transcription 
model. These RNAs have distinct sizes which are reproducible from 
cell to cell (Baric et al., 1987); however, they are not exactly the 
same size as the leader sequence present in the subgenomic mRNAs. 
Thus, these free leader RNAs must be processed before they are 
incorporated into mRNAs. 

2. A temperature-sensitive mutant of MHV, which synthesizes leader 
RNA but not mRNAs at the nonpermissive temperature, has been 
isolated (Baric et al., 1985). The isolation of this mutant suggests 
that MHV mRNA synthesis is discontinuous, requiring different 
viral proteins for the synthesis of leader RNA and mRNAs. Thus, a 
distinction can be made between leader RNA synthesis and 
mRNA synthesis. 

3. During mixed infections with two different MHV strains, as much 
as 40-50% of the leader sequence on the subgenomic mRNAs of one 
of the viruses is derived from the other coinfecting virus (Makino 
et al., 1986b). This result suggests that the leader sequence and 
body sequence of each mRNA are derived from two separate pools. 
This phenomenon is reminiscent of the RNA reassortment that occurs 
in RNA viruses with segmented RNA genomes. This result is best 
explained by the possibility that free leader RNAs participate 
in viral RNA synthesis. 

Fig. 8. Proposed models of coronavirus mRNA transcription. The solid lines represent 
positive-strand RNA, the broken lines negative-strand RNA. Boxes represent the 
leader RNA. 

4. In an in vitro transcription system utilizing cytoplasmic extracts 
from MHV-infected cells, exogenous leader RNAs can be utilized for 
mRNA synthesis (Baker and Lai, 1990). The exogenous leader RNA 
was incorporated into the subgenomic mRNAs at a site that matched 
precisely that of the endogenous leader RNA present in the viral 
subgenomic mRNAs, regardless of the length of the exogenous leader 
RNA used, suggesting that the exogenous leader RNA sequence was 
processed before being incorporated into mRNAs. Furthermore, the 
truncated leader RNA which lacked the 3’ end UCUAA sequence 
could not be incorporated into mRNAs, suggesting the importance 
of this sequence in transcription (Baker and Lai, 1990). 

5. The leader RNA sequence, specifically the copy number of the 
UCUAA repeats at the 3’ end of the leader RNA, can affect the 
transcription of some viral subgenomic mRNAs. For example, 
whereas an MHV strain containing two UCUAA repeats transcribes 
mRNA 2-1, a strain with three UCUAA repeats does not, despite 
identical sequences in the mRNA start sites of these two viruses 
(Shieh et al., 1989; Yokomori et al., 1991; La Monica et al., 1992). 
This finding suggests that the leader RNA plays an essential role 
in the regulation of mRNA transcription. 


According to this model, the free leader RNA binds to the mRNA 
start site (IG) of the full-length negative-strand template via the com- 
plementary sequences between the 3’ end of the leader (positive-strand) 
and the IG site of the template RNA (negative-strand) and serves as 
the primer for RNA transcription. The free leader RNA (primer) may 
be longer than the leader sequence in the subgenomic mRNAs. At some 
mRNA start sites, certain nucleotides are mismatched between the 
leader and the template; in these cases, the sequences in the mature 
mRNAs usually match those of the template instead of the leader. 
Therefore, before transcription starts, the free leader RNA probably 
undergoes 3’ end cleavage that removes the leader nucleotides not 
complementary to the template RNA (Lai, 1986, 1990; van der Most 
et al., 1994). Transcription is then initiated from the 3’ end of the 
processed leader RNA. 

This model is consistent with most of the sequence data of mRNAs. 
It also explains the curious finding that some mRNAs of MHV are 
heterogeneous in the copy number (from two to four) of the pentanucleo- 
tide (UCUAA) repeats at the leader-mRNA fusion site (Makino et al., 
1988c). This heterogeneity is best explained by the imprecise binding 
between the leader RNA and template RNA due to the presence of 
multiple copies of UCUAA (Lai, 1990). Indeed, BCV, which contains 
only one copy of UCUAA in both the 5’ leader and IG sites, does not 
show this type of heterogeneity in its mRNAs (Hofmann et al., 1993a). 

Some recent data, however, cannot be explained by this RNA 
sequence-homology-driven transcription model. A particular MHV 
strain (MHV-2C), which has four copies of the UCUAA in the leader 
RNA, synthesizes some subgenomic mRNAs that are very heteroge- 
neous in length and in leader-mRNA fusion sites (Zhang and Lai, 
1994). The sequence data of its mRNAs showed that the leader RNA 
of this virus is randomly fused to sites where no sequence homology 
exists between the leader and fusion sites (Zhang et al., 1994b). A 
similar though less conspicuous heterogeneity in the leader-mRNA 
fusion sites has also been observed in another MHV strain in a DI 
RNA-based transcription system (see Section V,E,5) (van der Most et 
al., 1994). These findings suggest that the sequence complementarity 
between the leader RNA and IG sites may not be the driving force for 
mRNA transcription. Thus, a modified version of the leader-primed 
transcription model proposes that the UCUAAAC sequence provides a 
recognition signal for viral polymerases and viral or cellular transcrip- 
tion factors. These proteins bind to the leader and IG sites of the 
template RNA, and the subsequent RNA-protein and protein-protein 
interactions result in the formation of a transcription complex to initiate 
mRNA transcription and effect leader-mRNA fusion (Lai et al., 1994; 
Zhang and Lai, 1995). 

The salient feature of this model is that the discontinuous transcrip- 
tion step occurs during positive-strand RNA synthesis; thus, transcrip- 
tional regulation is exerted mainly during positive-strand RNA synthe- 
sis. This is consistent with current knowledge of the regulation of MHV 
RNA synthesis. It has been shown that MHV mRNA transcription 
requires multiple cis-acting RNA sequences (see Section V,E,5). In 
contrast, the initiation of negative-strand RNA synthesis requires only 
the 3’ end 55-nt plus poly(A) (Lin et al., 1994). Thus, most of the 
regulatory elements appear to regulate positive-strand RNA synthesis. 
Since the free leader RNA is the centerpiece of this transcription model, 
it readily explains why the leader RNA from a different virus can be 
utilized freely in trans during mixed infections (Makino et al., 1986b). 
However, this model does not explain the finding that subgenomic 
replicative-intermediates (RI) and replicative-form (RF) RNAs were 
detected and were functional during viral RNA synthesis (Sawicki and 
Sawicki, 1990; Schaad and Baric, 1994) (see Section V,E,4,b). It is 
possible that the subgenomic mRNAs synthesized can be transcribed 
into subgenomic negative-strand RNAs, which, in turn, become the 
templates for mRNA transcription at a later stage in the viral replica- 
tion cycle. This would explain why the UV target sizes for mRNAs are 
nearly equivalent to the physical sizes of mRNAs late in the infection 
and yet are equivalent to the genomic-sized RNA early in the infection 
(Yokomori et al., 1992b). 

b. Discontinuous Transcription During Negative-Strand RNA Syn- 
thesis. In contrast to the leader-primed transcription model, this model 
proposes that the discontinuous transcription step occurs during 
negative-strand RNA synthesis, generating subgenomic negative- 
strand RNAs, which then serve as templates for subgenomic mRNAs 
in uninterrupted transcription. This model was proposed to account 
for the detection of subgenomic negative-strand RNAs (Sethna et al., 
1989; Hofmann et al., 1990) and subgenomic RIs (Sawicki and Sawicki, 
1990) in virus-infected cells. In this model, IG (mRNA start site) se- 
quences on the genomic RNA serve as termination or pausing signals 
for negative-strand synthesis (Konings et al., 1988), and the nascent 
subgenomic negative-strand RNA then jumps to the leader RNA se- 
quence at the 5’ end of the genomic RNA by an unknown mechanism 
to continue RNA synthesis. As a result, the nascent negative-strand 
subgenomic RNA fuses with the negative-strand leader sequence, gen- 
erating a subgenomic negative-strand RNA that contains an anti-leader 
sequence at its 3’ end and a poly(U) sequence at its 5’ end (Hofmann 
and Brian, 1991; Sethna et al., 1991). Structurally, these negative- 
strand RNAs are mirror images of the subgenomic mRNAs and, thus, 
can potentially serve as a template for uninterrupted transcription of 
subgenomic mRNAs. 

In this model, the regulation of subgenomic mRNA transcription 
would be exerted on negative-strand instead of positive-strand RNA 
synthesis. This model is consistent with the following observations: 


1. Subgenomic negative-strand RNAs have been detected in virus- 
infected cells (Sethna et al., 1989; Hofmann et al., 1990). These RNAs 
have structures that are mirror images of those of the completed 
subgenomic mRNAs. The relative molar ratios of the different sub- 
genomic negative-strand RNAs are similar to those of the corres- 
ponding viral mRNAs (Sethna et al., 1989; Hofmann et al., 1990). 

2. Subgenomic RI RNAs have been detected in virus-infected cells later 
in the infection (Sawicki and Sawicki, 1990). The smaller RIs were 
precursors of the smaller mRNAs and the larger RIs generated the 
larger mRNAs, suggesting that each subgenomic mRNA was transcribed 
from the corresponding subgenomic-sized negative-strand template 
(Sawicki and Sawicki, 1990). Another study, which analyzed the 
subgenomic RFs of a temperature-sensitive mutant of MHV, also 
suggested that subgenomic negative-strand RNAs are functional 
(Schaad and Baric, 1994), although RIs were not directly examined 
in that study. 

3. The UV targets for subgenomic mRNA synthesis at the later stage 
of viral replication are subgenomic in length (Jacobs et al., 1981; 
Stern and Sefton, 1982a; Yokomori et al., 1992b), roughly corre- 
sponding to the physical lengths of each subgenomic mRNA, suggest- 
ing that the templates for these mRNAs are subgenomic. 

4. In DI RNA systems (see Section V,E,5), when multiple IG sequences 
were present, the sequences in the 3’ end often had a higher tran- 
scription efficiency than those at the 5’ end, consistent with the 
proposal that IGs serve as transcriptional termination sites, which 
impede the elongation of the negative-strand RNAs (Van Marle et 
al., 1995; Krishnan et al., 1996). However, in some cases, the higher 
transcription efficiency of the 3’ proximal IG was observed only 
when the neighboring IGs were very close together, suggesting a 
spatial constraint rather than sequential interference (Joo and 
Makino, 1995). 


This model, however, cannot explain why the UV targets for subgeno- 
mic mRNA synthesis early in infection are of genomic size (Yokomori 
et al., 1992b) and why, later in the infection, the targets for these same 
mRNAs are still larger than the respective subgenomic mRNAs but 
not longer than genomic size (den Boon et al., 1995). It also cannot 
explain why the nature of the leader sequence can regulate differential 
transcription of various mRNA species, such as mRNA 2-1 of MHV, 
inasmuch as the leader sequence on the template RNA is localized 
downstream of the transcription termination site for negative-strand 
RNA synthesis. Finally, it is difficult to explain why the leader RNAs 
are derived in trans. 

c. Trans-Splicing of Nascent RNA Transcripts. This model proposes 
that the full-length positive- or negative-strand RNAs are spliced post- 
transcriptionally to generate subgenomic RNAs. It was initially consid- 
ered unlikely because of the findings that coronavirus replicates in the 
cytoplasm rather than in the nucleus (Brayton et al., 1981; Wilhelmsen 
et al., 1981), where the splicing machinery is present, and that UV 
target sizes of subgenomic mRNAs are equivalent to the physical sizes 
of subgenomic mRNAs (Jacobs et al., 1981). Furthermore, there are no 
consensus splicing donor and acceptor sequences in the coronavirus 
genomic RNAs. However, the trans-splicing model is compatible with 
recent findings that early in infection, the UV targets for subgenomic 
mRNA synthesis are of genomic length (Yokomori et al., 1992b), and 
that both the leader RNA and IG sequence of MHV negative-strand 
RNA bind to a cellular factor, heterogeneous nuclear RNP (hnRNP) 
A1, which is involved in alternative RNA splicing (Zhang and Lai, 1995; 
H.-P. Li and M. M. C. Lai, unpublished observation). A modified splicing 
model thus can be proposed as follows: a full-length negative-strand 
RNA is first synthesized. Components of the splicing machinery derived 
from the nucleus or cytoplasm then bind to the leader sequence and 
IG sites on the negative-strand RNA and form a splicing complex. The 
leader and IG can be derived from different RNA molecules. Splicing 
between the leader and IG generates a subgenomic negative-strand 
RNA. Once the spliced subgenomic negative-strand RNAs are gener- 
ated, they are used as templates for subsequent mRNA synthesis. Later 
in infection, even the subgenomic negative-strand RNAs may be able 
to participate in RNA splicing to generate smaller subgenomic negative- 
strand RNAs because they themselves also contain the leader and IG 
sequences. This model may thus explain why the UV target for mRNA 
transcription is of genomic length early in infection (Yokomori et al., 
1992b) and may shed light on the recent puzzling finding that later in 
infection, the UV target sizes are still larger than the actual sizes of 
the subgenomic mRNAs (den Boon et al., 1995). It also explains the 
functional roles of subgenomic RIs (Sawicki and Sawicki, 1990). This 
potential splicing, however, must be different from conventional RNA 
splicing because it occurs in the cytoplasm, and the splicing donor and 
acceptor sequences must also be different from the conventional ones. 
Since some of the splicing factors are probably derived from the nucleus, 
this model predicts that nuclear functions are involved in MHV RNA 
transcription. 

d. Amplification of Virion-Associated Subgenomic RNAs. Based on 
the findings that some coronaviruses, including BCV, TGEV, and IBV 
(Sethna et al., 1989; Hofmann et al., 1990; Zhao et al., 1993), contain 
subgenomic mRNAs in the virion, probably as a result of nonspecific 
RNA packaging, it was proposed that these virion-associated subgeno- 
mic mRNAs can be used directly as templates for the synthesis of 
subgenomic negative-strand RNAs, which, in turn, serve as templates 
for the synthesis of additional subgenomic mRNAs (Sethna et al., 1989). 
This model may explain the presence of subgenomic negative-strand 
RNAs and RIs in the infected cells, but it cannot explain the genomic- 
length nature of the UV target sizes for mRNA synthesis early in 
infection (Yokomori et al., 1992b), nor can it explain how leader RNAs 
from one virus strain can be randomly incorporated into mRNAs 
of a different virus. Furthermore, the virion-associated subgenomic 
mRNAs have not been detected in all coronavirus species. 

The available data cannot unequivocally rule out any of the proposed 
transcription models. The primary difficulty in experimental analysis 
is that once the subgenomic mRNAs are synthesized, by whatever 
mechanism, they are transcribed into negative-strand RNAs because 
the cis-acting signal for negative-strand RNA synthesis in MHV resides 
in the 55 nucleotides at the 3’ end plus poly(A) (Lin et al., 1994), which 
is present in every subgenomic RNA. Thus, it is difficult to separate 
the primary and secondary events of transcription. It is possible that 
these transcription models are not mutually exclusive. For example, 
early in infection, a leader-primed transcription or trans-splicing mech- 
anism may operate, generating subgenomic mRNAs, which are then 
amplified into subgenomic negative-strand RNAs; the latter serve as 
templates for further amplification of subgenomic mRNAs thereafter. 
The subgenomic negative-strand RNA can be used for either unin- 
terrupted transcription or leader-primed transcription to generate 
positive-strand subgenomic RNAs. A combination of these models 
would be consistent with most of the experimental data. This two- 
step model of primary and secondary transcription (Jeong and Makino, 
1992) may explain the apparent differences in the possible mechanism 
of transcription between early and late stages of viral infection. 


5. Cis- and Trans-Acting Signals for Transcription as Revealed by 
DI RNA Vectors 


Because of the large size of coronavirus RNA, no infectious cDNA 
or RNA clones are now available for reverse genetics studies. This 
difficulty has hampered progress in the study of the molecular biology 
of coronaviruses. DI RNAs of several coronaviruses (see Section VI,E) 
have been molecularly cloned and used as a substitute for the genomic 
RNA to study the cis- and trans-acting signals involved in viral RNA 
synthesis. Although natural DI RNAs do not contain an mRNA start 
signal and, consequently, cannot transcribe an mRNA, the insertion 
of such a signal into the DI RNA allows an mRNA to be transcribed 
from the transfected DI RNA in the virus-infected cells, thus enabling 
studies of the regulatory sequences for transcription. 

Following is a summary of information that has been obtained using 
this approach. It should be cautioned, however, that regulation of RNA 
transcription probably depends on overall RNA conformation and that 
the cis-acting sequence required for RNA synthesis very often varies 
with the DI RNA vector used; therefore, the results obtained from DI 
RNA studies may not be directly applicable to the viral genome. A full 
understanding of the regulation of viral RNA synthesis still awaits the 
development of an infectious cDNA clone. 

The following cis-acting signals for coronavirus RNA transcription 
have been determined primarily from MHV DI RNA studies (with some 
from BCV DI) (Fig. 9). 

a. IG Sequence. The IG sequence can be considered to be the pro- 
moter element for transcription. It also serves as the mRNA start site 
and the site of fusion between the leader RNA and body sequence of 
mRNAs. A seven-nucleotide core sequence, UCUAAAC, is sufficient to 
initiate mRNA synthesis (Makino et al., 1991). Extensive site-specific 
mutagenesis studies have shown that most of the single-nucleotide 
mutations within this core sequence could be tolerated, although the 
transcription efficiency of some of these mutants was lower (Joo and 
Makino, 1992; van der Most et al., 1994). These seven nucleotides 
represent the minimum promoter; deletion of a nucleotide results in 
complete ablation of mRNA transcription. The effects of the sequences 
near the promoter on transcription are contradictory: in certain 
situations, the nature of the neighboring sequences did not affect 
transcription (Makino and Joo, 1993), but under other circumstances, 
it did (Jeong et al., 1996). Thus, the strength of the promoter appears 
to depend on the context of the overall RNA sequence and structure. 

Fig. 9. Cis-acting signals for various steps of MHV DI RNA synthesis. The boxed 
regions represent the cis-acting signals for the indicated steps of RNA synthesis. 

The relative flexibility of sequence requirement of the promoter se- 
quence in the DI RNA system appears to differ significantly from that 
seen in the viral genomic RNA. In the MHV genome, there are more 
than 20 stretches of sequence resembling the UCUAAAC sequence, in 
addition to the six promoters for the known subgenomic mRNAs (Joo 
and Makino, 1992). Yet, most of these did not promote mRNA synthesis 
from the viral RNA genome to any appreciable extent, in contrast to 
their ability to promote transcription in the DI RNA vector system 
(Joo and Makino, 1992). In the viral genome, the single-nucleotide 
substitution of a G residue in the core promoter sequence completely 
abolished mRNA synthesis (Shieh et al., 1989), whereas this is tolerated 
in the DI RNA (Joo and Makino, 1992). Thus, there appear to be signifi- 
cant differences between the sequence requirement for mRNA synthe- 
sis in the DI RNA and in the natural viral genomic RNA. When there 
are multiple IG sequences in the DI RNA, the order of the IG sequences 
may influence transcriptional efficiency. An IG located at the 3’ end 
generally has an advantage in initiating mRNA synthesis (Van Marle 
et al., 1995; Krishnan et al., 1996). The sequences near the IGs may 
suppress transcription (Jeong et al., 1996). 
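
To illustrate what counting stretches "resembling" the core sequence might involve, the following sketch scans an RNA string for windows within one mismatch of UCUAAAC. The one-mismatch threshold, the function name, and the test sequence are assumptions for illustration; the criteria actually used in the cited studies are not given here.

```python
CORE = "UCUAAAC"  # seven-nucleotide core sequence named in the text


def find_core_like_sites(rna, core=CORE, max_mismatches=1):
    """Return (position, site, mismatches) for every window of the RNA that
    differs from the core sequence at no more than max_mismatches positions.
    Positions are 0-based offsets into the input string."""
    hits = []
    width = len(core)
    for start in range(len(rna) - width + 1):
        window = rna[start:start + width]
        mismatches = sum(1 for a, b in zip(window, core) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical sequence containing one exact and one single-mismatch site.
example = "GGUCUAAACGGUCUAUACGG"
sites = find_core_like_sites(example)
```

A scan like this finds only candidate sites. As the text notes, most such stretches in the viral genome do not promote mRNA synthesis to any appreciable extent, so sequence similarity alone does not predict promoter activity.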

b. The Leader Sequence at the 5' End of the DI RNA. The leader 
sequence at the 5’ end of the viral genomic RNA becomes the leader 
sequence of subgenomic mRNAs; thus, it fills a structural role for mRNA 
synthesis. However, the leader RNA of the subgenomic mRNAs is not 
derived exclusively from the leader RNA of the same (DI) RNA; in fact, 
most are derived in trans from a separate RNA molecule, such as helper 
virus RNA (Jeong and Makino, 1994; Liao and Lai, 1994; Zhang et al., 
1994b). Nevertheless, mRNA transcription from an IG site in the DI 
RNA still requires the presence of a leader RNA sequence at the 5’ 
end of the DI RNA as a cis-acting sequence (Liao and Lai, 1994). 
Deletion of this cis-acting leader abolished transcription. Furthermore, 
the sequence of this leader RNA, particularly its 3’ end sequence, can 
affect the efficiency of transcription from certain IG sequences on the 
DI RNA (Zhang et al., 1994b). For example, the leader RNA containing 
two pentanucleotide (UCUAA) repeats transcribes an mRNA from the 
IG 2-1 site more efficiently than the leader RNA with three UCUAA 
repeats. Thus, the cis-acting leader RNA plays a role similar to that 
of an enhancer. These findings suggest that the leader RNA serves two 
functions (Liao and Lai, 1994): (1) it supplies the leader RNA to the 
subgenomic mRNAs, and (2) it serves as an enhancer-like sequence to 
regulate transcription. This finding also suggests that there is either 
a direct or an indirect interaction between the leader and IG sequences. 

Some additional sequences downstream of the leader may also en- 
hance transcription from an IG site in the DI RNA (Liao and Lai, 1994); 
however, the precise sequence requirement is not known. This sequence 
requirement shows some virus sequence specificity, since it cannot be 
replaced with other viral RNA sequences (Liao and Lai, 1994). It may 
be needed to maintain overall RNA conformation for the recognition 
of the IG sequence. 

c. The 3' UTR. In an MHV DI RNA construct, partial deletion of 
the 3’ UTR completely abolished transcription from an upstream IG 
site in the DI RNA (Lin et al., 1996). This stretch of 3’ UTR is probably 
involved in positive-strand RNA synthesis, since the length of this 
required sequence (305 nt) is significantly longer than that required 
for negative-strand RNA synthesis (55 nt). The 3’ UTR requirement for 
mRNA transcription is surprising, since positive-strand RNA synthesis 
starts from the 5’ end; thus, the 3’ end sequence is the last to be 
transcribed. This 3' UTR sequence requirement is similar to that for 
RNA replication (Kim et al., 1993b; Lin and Lai, 1993) (see Section 
V,F). This finding suggests that the 3’ end may interact with the 5’ 
end and possibly with IG sequences during transcription. 

d. A Nine-Nucleotide Sequence, UUUAUAAAC. This sequence, located 
immediately downstream from the UCUAA repeats at the 3’ end 
of the leader RNA in the viral genome, plays a significant role in RNA 
transcription. It is deleted from the genome of one of the MHV strains 
and is often deleted in naturally occurring DI RNAs (Lai et al., 1987). 
In this particular MHV strain (MHV-2C), the leader-mRNA fusion 
sites are very heterogeneous and do not always occur at the usual 
UCUAAAC sites (Zhang et al., 1994b). This nine-nucleotide sequence 
can serve as an mRNA start signal, allowing transcription of an almost 
genomic-length mRNA (Zhang and Lai, 1996). In the DI RNAs, the 
presence or absence of this nine-nucleotide sequence influences tran- 
scription efficiency from the downstream IG site and, most importantly, 
affects the source of the leader RNA incorporated into subgenomic 
mRNAs (Zhang et al., 1994b). When this nine-nucleotide sequence is 
present, the leader sequence in the subgenomic mRNAs is contributed 
both from the DI RNA in cis and from helper virus RNA in trans. When 
this sequence is missing, the leader RNA is derived exclusively from 
the helper virus RNA (Zhang et al., 1994b). Thus, this nine-nucleotide 
sequence appears to regulate the mechanism by which the leader RNA 
is fused to the subgenomic mRNAs. 

These results combined suggest that multiple RNA regions are involved 
in the regulation of mRNA transcription. However, a recent 
study appears to contradict the need for cis-acting sequences other than 
the IGs for mRNA transcription. When a negative-strand RNA contain- 
ing only an IG sequence of TGEV and a reporter gene was transcribed 
in situ from a transfected cDNA by using a recombinant vaccinia 
virus-T7 RNA polymerase expression system, this RNA was tran- 
scribed in the presence of TGEV, generating an mRNA with a correctly 
fused TGEV leader sequence (Hiscox et al., 1995b). The leader- 
containing mRNA could have been generated by either of the transcription 
mechanisms described above (Section V,E,4,a or Section V,E,4,b). 
This study suggests that this negative-strand IG site is sufficient for 
transcription. However, it is possible that this activity represents a 
basal level of transcription and that other cis-acting sequences may 
enhance the efficiency of transcription. 


6. Proteins Involved in RNA Synthesis 


The application of inhibitors of protein synthesis at any time during 
the viral life cycle inhibits viral RNA synthesis, suggesting that continuous
protein synthesis is required for RNA synthesis (Perlman et al.,
1986; Sawicki and Sawicki, 1986). A similar observation has been made 
using an inhibitor of cysteine proteases, which inhibits a specific step 
of the processing of gene 1a products of MHV (Kim et al., 1995) (see 
Section V,G,2), suggesting that continuous production of polymerase 
is required for viral RNA synthesis. The precise nature of the viral 
proteins involved has yet to be determined. Temperature-sensitive mu- 
tants of MHV that are defective in RNA synthesis at the nonpermissive 
temperature have been divided into at least five complementation 
groups, indicating that at least five proteins are involved in viral RNA 
synthesis (Leibowitz et al., 1982a; Baric et al., 1990) (see Fig. 6). All 
of these complementation groups are mapped within the gene 1 region 
(including both 1a and 1b). Sequence analysis showed that gene 1b 
contains an RNA polymerase motif (Gorbalenya et al., 1989b; Lee et 
al., 1991). Polymerase activities have been demonstrated in membrane 
fractions of BCV- and MHV-infected cells (Brayton et al., 1982, 1984; 
Dennis and Brian, 1982), and several in vitro RNA synthesis systems 
have been reported (Compton et al., 1987; Leibowitz and DeVries, 1988; 
Baker and Lai, 1990); however, the nature of polymerases in these 
systems has not been identified. In one study, antibodies against
the N protein were shown to inhibit RNA synthesis,
suggesting that N protein may be involved in RNA synthesis (Compton 
et al., 1987). 

In addition to viral proteins, cellular factors may also be involved
in RNA synthesis. Several cellular proteins have been shown to bind 
to the regulatory elements of MHV RNA, including the 5’ and 3’ ends 
of the genomic RNA and the 3’ end of the negative-strand RNA and 
IG sites (Furuya and Lai, 1993; Yu and Leibowitz, 1995a,b; Zhang and 
Lai, 1995). The binding sites for the cellular proteins at the 5’ end of 
genomic RNA and the 3’ end of negative-strand RNA are complemen- 
tary (Furuya and Lai, 1993). The protein p35, which binds to the 
negative-strand leader sequence and the IG site, is particularly inter- 
esting. Site-specific mutations of the IG site affected the binding of this 
protein and the efficiency of RNA transcription to the same extent, 
suggesting that the binding of this protein is required for RNA tran- 
scription (Zhang and Lai, 1995). This protein recently has been identi- 
fied as hnRNP A1 (H.-P. Li and M. M. C. Lai, unpublished observation). 
The mutations at the 3’ end of the viral genomic RNA that abolished 
the binding of cellular proteins also inhibited both negative-strand 
and positive-strand RNA synthesis, although the correlation between 
protein binding and RNA replication was not absolute (Yu and Leibo- 
witz, 1995a). Thus, cellular proteins probably play a significant role in 
viral RNA replication and transcription. Curiously, viral proteins in 
the infected cell extract could not be cross-linked to the viral RNA in 
vitro, suggesting that viral proteins may interact with viral RNA only 
indirectly through cellular proteins. This is in contrast to the finding 
that the purified N protein can bind to the leader RNA sequence in 
vitro (Baric et al., 1988; Stohlman et al., 1988). The reason for this
discrepancy is not clear. 


F. Replication of Viral Genomic RNA 


The genomic-sized RNA in coronavirus-infected cells theoretically 
consists of two populations: the messenger RNA (mRNA 1), which is 
translated to yield gene 1a and 1b products, and the genomic RNA, 
which is destined to be packaged into virions. Early studies demonstrated
that, late in the infection, most (95%) of the genomic-sized
RNA in the cells was associated with the viral nucleocapsid, while the
remainder (5%) was present in polysomes (Spaan et al., 1981; Perlman
et al., 1986). Presumably, early in infection, most of the genomic-sized 
RNA would be associated with polysomes to serve as mRNAs for the 
synthesis of polymerase; however, this has not been demonstrated. It 
is not clear whether there is any difference in structure and mechanism 
of synthesis between these two RNA populations. 

Since genomic RNA requires uninterrupted synthesis from the full-length
template, whereas mRNAs involve discontinuous transcription,
the two types of genomic-sized RNA (mRNA 1 and virion genomic RNA)
may be synthesized by two different mechanisms. A recent study suggests
that at least some of the MHV genomic-sized RNAs are indeed
synthesized by discontinuous transcription, using the UCUAA repeat
in the leader RNA and the nine-nucleotide UUUAUAAAC immediately 
downstream of the leader RNA as the transcription start site (Zhang 
and Lai, 1996). This raised the possibility that mRNA 1 and virion 
genomic RNA are distinguishable. However, it cannot be inferred from 
this study that the fate of the genomic-sized RNA products derived 
from discontinuous transcription is different from the fate of those 
derived from uninterrupted RNA synthesis. 

The possible involvement of discontinuous transcription in generat- 
ing genomic-sized RNA may explain several interesting findings re- 
garding MHV genomic RNA: 


1. The copy number of the UCUAA repeat in the leader sequence of 
the genomic RNA, which ranges from two to four copies in different 
MHV strains, rapidly evolves during virus passage (Makino and 
Lai, 1989a; La Monica et al., 1992). Starting with a pure virus
population, the copy number in the viral genomic RNA rapidly be- 
comes heterogeneous during serial passages in tissue culture, and 
a new virus population with a different copy number of UCUAA 
repeats emerges (Makino and Lai, 1989a). Since this sequence varia- 
tion is seen in the leader region but not in the IG regions, where 
uninterrupted RNA synthesis probably occurs, this finding is best 
explained by the discontinuous transcription mechanism involving 
the 5’ leader region. The imprecise fusion of the leader RNA to the 
mRNA start sites would result in heterogeneity of the copy number 
of the UCUAA repeats (Makino et al., 1988c; Lai, 1990). Such hetero- 
geneity is not observed when the virus, e.g., BCV, contains only one 
UCUAA copy in the leader RNA (Hofmann et al., 1993a).

2. The UCUAA region at the 5’ end of the genomic RNA is a hot spot 
of RNA recombination during mixed infection of MHVs, resulting 
in recombinant MHVs with a crossover site at the 3’ end of the 
leader RNA sequence (Keck et al., 1987). This result is best explained 
by the discontinuous RNA synthesis at the 5’ end of the geno- 
mic RNA. 

3. If the generation of DI RNAs is viewed as an anomaly of RNA 
replication, the structure of naturally occurring DI RNAs reveals an 
insight into the mechanism of RNA replication. Most of the naturally 
occurring MHV DI RNAs have a copy number of the UCUAA repeat
different from that of the parental virus, and most lack a nine- 
nucleotide sequence downstream of the UCUAA repeats (Lai et al., 
1987). As discussed above, this is a reflection of discontinuous tran- 
scription in the region. 
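
As an illustrative aside, the repeat-number heterogeneity described in points 1 and 3 can be tallied computationally. The following Python sketch counts tandem UCUAA copies in a leader sequence. The function name and the example sequences are hypothetical, invented here for illustration only, and are not taken from any sequenced MHV strain.

```python
import re


def count_ucuaa_repeats(leader: str, unit: str = "UCUAA") -> int:
    """Return the number of copies in the longest tandem run of `unit`.

    DNA input is accepted and converted to RNA (T -> U).
    """
    rna = leader.upper().replace("T", "U")
    # Each match is one uninterrupted run of back-to-back repeat units.
    runs = re.findall(f"(?:{re.escape(unit)})+", rna)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical leader 3' ends (assumption: made-up toy sequences).
toy_leaders = {
    "two_copies": "AAUCUAAUCUAAAC",
    "three_copies": "AAUCUAAUCUAAUCUAAAC",
    "four_copies": "AAUCUAAUCUAAUCUAAUCUAAAC",
}

for name, seq in toy_leaders.items():
    print(name, count_ucuaa_repeats(seq))
```

Applied to leader sequences sampled at successive passages, a tally of this kind would make a drift in copy number directly visible.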


The understanding of the mechanism of RNA replication has been 
aided by the use of in vitro-transcribed DI RNA generated from cloned 
cDNA. When DI RNA was transfected into virus-infected cells, the 
leader RNA was rapidly replaced by that of the helper virus (Makino 
and Lai, 1989b; Chang et al., 1996). This leader exchange is dependent 
on the presence of the nine-nucleotide sequence (UUUAUAAAC) in the 
DI RNA (Makino and Lai, 1989b), consistent with the finding that this 
sequence serves as an mRNA start signal for discontinuous transcrip- 
tion (Zhang and Lai, 1996). The use of the cloned DI RNA also allowed 
the determination of the cis-acting signals for RNA replication (Kim 
et al., 1993b; Lin and Lai, 1993). It was shown that more than 400 
nucleotides at both the 5’ and 3’ ends of the DI RNA are required for 
RNA replication, and that some MHV DI RNAs also required a stretch 
(130 nt) of internal sequence in the gene 1 region for RNA replication; 
however, the requirement for the internal sequence was not observed 
in other MHV or BCV DI RNA constructs (Chang et al., 1994; Luytjes
et al., 1996). Thus, this internal sequence probably plays a role in 
maintaining the overall RNA conformation for some DI RNAs (Y. N. 
Kim and Makino, 1995). Again, the requirement of a 3’ end sequence 
(436 nt) that is longer than that required for negative-strand RNA 
synthesis (55 nt) is a surprise. These 3’ end sequences are probably 
required for positive-strand RNA synthesis during RNA replication. 
This finding is reminiscent of the sequence requirement for RNA tran- 
scription discussed above and suggests that there is a direct or indirect 
RNA-RNA interaction between the 5’ and 3’ ends during RNA replica- 
tion. These DI RNA studies also showed that replication of DI RNA is 
inhibited when an mRNA is transcribed from an IG site within the 
same DI RNA, and that the mechanism of inhibition is due not to 
competition for the same transcription machinery (Jeong and Makino, 
1992), but most likely to the overlap of the cis-acting signals for these 
two different processes. However, the sequence requirements for repli- 
cation and transcription are different, indicating that these two pro- 
cesses are distinguishable. 

The mRNA transcription and genomic RNA replication may be regu- 
lated by the same mechanism throughout most of the viral replication 
cycle. However, the ratio between the genomic RNA and subgenomic 
RNAs, as detected by radioactive uridine incorporation, increases during
the late stages of the BCV replication cycle (Keck et al., 1988a),
suggesting a possible switching mechanism from transcription to repli- 
cation. It has been shown that genomic RNA replication is coupled to 
encapsidation, since no free genomic RNA is found (Perlman et al., 
1986). Since the encapsidation of RNA requires the N protein, this 
protein may participate in the regulation of switching between tran- 
scription and replication. 


G. Translation of Viral Proteins


1. Mechanisms of Translation


The sequences of coronavirus mRNAs usually start from a site imme- 
diately upstream of a gene. These mRNAs, except for the smallest 
mRNA, are structurally polycistronic, containing multiple ORFs. Only 
the 5’-most ORF in the mRNAs is translatable; the remaining ORFs are
usually functionally silent. Thus, most of these mRNAs are functionally 
monocistronic (see Fig. 7). The S, HE, M, and N proteins, and in most 
coronaviruses the E protein, are translated from separate mRNAs by 
this mechanism; initiation of their translation is unremarkable, utiliz- 
ing a cap-dependent translation mechanism. Many ns proteins, how- 
ever, are translated from truly polycistronic mRNAs, i.e., two or three 
proteins are translated from the same mRNA. For these mRNAs, the 
first ORF, e.g., 3a of IBV or 5a of MHV, is probably also translated by 
the same mechanism as the structural protein genes. For internal 
ORFs, e.g., E protein of IBV and MHV, an alternative mechanism must 
be employed to initiate translation internally. 
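
The functionally monocistronic behaviour described above can be sketched in a few lines of Python. The sketch assumes simple 5' scanning, in which only the ORF whose AUG lies closest to the 5' end is translated. The toy sequence and function names are hypothetical, and the model deliberately ignores internal initiation, leaky scanning, and frameshifting.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna: str) -> list[tuple[int, int]]:
    """Return (start, end) indices of AUG-initiated ORFs, ordered 5' to 3'.

    `end` is the index just past the stop codon. For each stop codon only
    the most upstream in-frame AUG is kept.
    """
    rna = mrna.upper().replace("T", "U")
    first_start_for_stop: dict[int, int] = {}
    for start in range(len(rna) - 2):
        if rna[start:start + 3] != "AUG":
            continue
        # Walk downstream codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(rna) - 2, 3):
            if rna[pos:pos + 3] in STOP_CODONS:
                first_start_for_stop.setdefault(pos + 3, start)
                break
    return sorted((start, end) for end, start in first_start_for_stop.items())


def translated_orf(mrna: str):
    """Under the 5' scanning assumption, only the 5'-most ORF is expressed."""
    orfs = find_orfs(mrna)
    return orfs[0] if orfs else None


# Hypothetical structurally bicistronic mRNA (assumption: toy sequence).
toy_mrna = "GGAUGAAACCCUAAGGAUGGGGUUUUGAGG"
print(find_orfs(toy_mrna))       # both ORFs present in the sequence
print(translated_orf(toy_mrna))  # only the 5'-most one is treated as translated
```

In this toy model the downstream ORF is structurally present but functionally silent, which is the distinction drawn in the text between polycistronic structure and monocistronic function.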

One characteristic of coronavirus mRNAs is the presence of the 
leader RNA sequence at the 5’ end, which not only participates in RNA 
transcription, but also regulates the efficiency of translation. It has been 
shown that the presence of the MHV leader sequence on a heterologous 
mRNA in a chimeric RNA construct can enhance its translation in 
virus-infected cell lysates but not in uninfected cell lysates (Tahara et 
al., 1994). This effect conceivably will enable the efficient translation 
of viral mRNAs in the face of shutoff of translation of cellular mRNAs 
in the infected cells (Siddell et al., 1980; Hilton et al., 1986). The mecha- 
nism of translational enhancement by the leader RNA has not been 
determined. It has been shown that during persistent infection of BCV, 
the leader RNA sequence underwent frequent mutations (Hofmann et 
al., 1993b). One of these mutants had an intraleader short ORF and 
a lower translation efficiency, indicating that the leader sequence indeed
can modulate translation. Another region which can potentially
regulate the translation of coronavirus mRNAs is the 5’ UTR (other 
than the leader sequence) of mRNAs. The genomic RNA (mRNA 1) has 
a particularly long 5’ UTR (200-400 nt). An MHV with a specific point 
mutation within the 5’ UTR was selected during persistent infection 
in vitro (Chen and Baric, 1995). This mutant had a significantly higher 
translation efficiency than the wild-type virus. Different subgenomic 
mRNAs have 5’ UTRs of various lengths, which may also affect their
translation. 

For the translation of internal ORFs, several different mechanisms 
are used by coronaviruses: 

a. Ribosomal Frameshifting Within the Polymerase Gene. All of the
coronavirus gene 1 (polymerase) sequences determined so far contain two overlapping
ORFs. Several features of the IBV polymerase gene sequence
(Boursnell et al., 1987), coupled with the absence of a distinct mRNA
for ORF 1b, suggested that translation of ORF 1b involved ribosomal 
frameshifting from ORF 1a, thus synthesizing a large polyprotein containing
both 1a and 1b sequences. Subsequently, a highly efficient (30%
frequency) -1 frameshift was demonstrated experimentally in vitro
(Brierley et al., 1987; Somogyi et al., 1993) and in vivo (Brierley et al.,
1990). This mechanism has been shown to operate in gene 1 of MHV, 
HCV-229E, and TGEV as well (Bredenbeek et al., 1990; Lee et al., 
1991; Herold and Siddell, 1993; Eleouet et al., 1995). In all cases, the 
mechanism involves two essential elements: a slippery site followed by 
an RNA pseudoknot (Brierley et al., 1989). The site at which the ribo- 
some slips backward has the sequence UUUAAAC. The pseudoknots of 
IBV and MHV are similar, comprising two base-paired regions stacked 
coaxially in a quasi-continuous manner and connected by two single- 
strand loop regions. The HCV-229E pseudoknot is more complex 
(Herold and Siddell, 1993). It is the overall shape and stability of the 
pseudoknot that are important, not the nucleotide sequence per se. 
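
The effect of a -1 shift at the slippery site can be illustrated with a short Python sketch. It assumes that the heptanucleotide UUUAAAC is read in the zero frame as U UUA AAC and that, after slipping back by one nucleotide, the ribosome resumes reading at the final C of the site. That resumption point is an assumption of this sketch and is not stated in the text. The toy sequence and function names are hypothetical, and the pseudoknot is not modelled.

```python
def codons(rna: str, start: int = 0) -> list[str]:
    """Split `rna` into complete triplets, beginning at index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def read_with_minus_one_shift(rna: str, site: str = "UUUAAAC") -> list[str]:
    """Return the codons read when the ribosome slips back one nucleotide.

    Reading proceeds in the zero frame through the slippery site, then
    continues from one nucleotide upstream of the site's end (-1 frame).
    """
    rna = rna.upper().replace("T", "U")
    site_start = rna.find(site)
    if site_start == -1:
        return codons(rna)  # no slippery site: ordinary zero-frame reading
    site_end = site_start + len(site)
    if site_end % 3 != 0:
        raise ValueError("slippery site is not in the assumed zero frame")
    upstream = codons(rna[:site_end])        # ... UUA AAC, zero frame
    downstream = codons(rna, site_end - 1)   # re-read the last nucleotide
    return upstream + downstream


# Hypothetical toy message (assumption: invented for illustration).
toy = "AUGGCUUUAAACGGGUAA"
print(codons(toy))                     # zero frame ends in the stop codon UAA
print(read_with_minus_one_shift(toy))  # -1 frame reads through past that stop
```

In the toy message the zero frame terminates at UAA, whereas the shifted reading continues, mirroring how ORF 1b is reached from ORF 1a.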

Two reasons have been put forward to explain why coronaviruses 
should employ ribosomal frameshifting to translate ORF 1b (Brown 
and Brierley, 1995). One reason is that this is done primarily to control
the relative amounts of the 1a and 1b products. That could be achieved 
in other ways, of course, e.g., by translating ORF 1b from a separate 
mRNA; this would require that the transcription of 1a and 1b mRNAs is
tightly regulated. The other reason may be to avoid making a 1b mRNA. 
Such an mRNA might be packaged into virions in competition with 
genomic RNA, as the RNA region corresponding to the 1b ORF of MHV 
contains a sequence that is essential for packaging into virions (Fosmire 
et al., 1992) (see Section VI,E,1). 
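
The first of these explanations, control of the relative amounts of the 1a and 1b products, can be made concrete with a small calculation. The Python sketch below assumes that every ribosome reaching the slippery site either shifts into ORF 1b or continues to the ORF 1a stop codon, and it ignores ribosome drop-off and differences in protein stability. The function name is hypothetical.

```python
def orf1_product_fractions(frameshift_efficiency: float) -> tuple[float, float]:
    """Return (fraction 1a-only product, fraction 1a-1b fusion product)."""
    if not 0.0 <= frameshift_efficiency <= 1.0:
        raise ValueError("frameshift efficiency must lie between 0 and 1")
    fusion = frameshift_efficiency            # ribosomes that shift into 1b
    orf1a_only = 1.0 - frameshift_efficiency  # ribosomes that stop at 1a
    return orf1a_only, fusion


# The 30% frameshift frequency quoted in the text.
orf1a_only, fusion = orf1_product_fractions(0.30)
print(f"1a only: {orf1a_only:.2f}  1a-1b fusion: {fusion:.2f}  "
      f"ratio {orf1a_only / fusion:.1f}:1")
```

Under these assumptions a 30% frameshift frequency fixes the molar ratio of 1a-only to 1a-1b fusion polyprotein at roughly 2.3 to 1, without any need for separately regulated mRNAs.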

b. Internal Initiation of Translation of the IBV and MHV E Protein
mRNA. The E proteins of IBV and MHV are encoded by the third and 
second ORF, respectively, of the corresponding genes 3 and 5 (Fig. 5). 
Cells infected with IBV contain the products of all three of the gene 3 
ORFs (Liu et al., 1991). Both of the MHV gene 5 ORFs are translated 
in vitro (Budzilowicz and Weiss, 1987), but only the 5b ORF product 
has been detected in virus-infected cells (Leibowitz et al., 1988). Experi- 
ments have shown that the E protein ORF of both IBV and MHV 
mRNAs is translated by a cap-independent, internal ribosomal entry 
mechanism (Liu and Inglis, 1992b; Thiel and Siddell, 1994). Further- 
more, if the 3a and 3b ORFs were eliminated from the IBV mRNA, 
translation of the 3c (E) ORF did not occur (Liu and Inglis, 1992b). 
This suggested that the 3a/3b region contains an internal ribosome 
entry site (IRES) for the E protein ORF. Le and colleagues have pre- 
dicted the existence of secondary structures in the 3a/3b region of IBV 
which resemble the IRES elements of picornaviruses (Le et al., 1994). 
They predicted a 265-nucleotide sequence in 3a/3b which would fold 
into five stem-loops, forming a compact structure by the interaction of 
two pseudoknots. 

c. Translation of Nonstructural Proteins. In addition to the ns pro- 
teins encoded from the 5'-most ORFs of mRNAs, several other ns pro- 
teins are encoded from an internal ORF of some viral mRNAs, e.g., 3b 
of IBV and HCV-229E, 4b of BCV, and 7b of FCV (Fig. 5). Most of 
these products have been detected in virus-infected cells; however, the 
mechanism of the internal initiation of translation has not been eluci- 
dated. 

BCV and MHV RNAs contain an additional internal ORF within
the N protein gene. This ORF (termed I) is in a different reading frame 
from that of N protein and encodes a hydrophobic protein (Senanayake 
et al., 1992; Fischer et al., 1997). This protein is translated in virus- 
infected cells by a leaky ribosomal scanning mechanism from the bicis- 
tronic mRNA of N gene (Senanayake et al., 1992). It is a nonessential 
gene. The mechanism of its regulation is not yet clear. 


2. Posttranslational Processing and Modifications 


a. Processing of Pol Proteins 1a and 1b. The gene 1 product is pre- 
dicted to be nearly 700-800 kDa. It is probably processed into multiple 
proteins posttranslationally by its own proteases. The processing path- 
way has just begun to be explored. Remarkably, the protease domains 
and potential cleavage sites predicted by computer analysis (Gorba- 
lenya et al., 1989b; Lee et al., 1991) have largely been confirmed by 
experimental data. Initially, in vitro translation of virion RNA of MHV 
revealed several polypeptides of more than 200 kDa (Leibowitz et al.,
1982b; Denison and Perlman, 1986). In addition, a 28-kDa product was 
detected and shown to have originated by cleavage from the N terminus 
of a precursor (Denison and Perlman, 1986; Soe et al., 1987), now known
to be the beginning of the ORF 1a polyprotein (Fig. 6). The cleavage 
which generates p28 is carried out by PLP 1 (Fig. 6). It cleaves between 
residues Gly-247 and Val-248, mutation of either residue resulting 
in almost no cleavage (Dong and Baker, 1994; Hughes et al., 1995).
In addition to p28, the MHV ORF 1a encodes a protein of more than 
400 kDa, which is cleaved to a 290-kDa product, which, in turn, is 
cleaved to produce a 50-kDa and a 240-kDa product (Denison et al.,
1992) (Fig. 6). Another protein of 65 kDa is derived from sequence 
immediately downstream of the p28-encoding region, thus representing 
the N-terminal part of the large polyprotein initially found in in vitro 
translation (probably more than 400 kDa) (Denison et al., 1995) (Fig.
6). The cleavage of p65 from the polyprotein was also carried out by 
PLP1 (Bonilla et al., 1995, 1997). Inhibition of the C-terminal cleavage 
of p65 by E64d, an irreversible inhibitor of cysteine (thiol) proteinases,
inhibited MHV replication (Kim et al., 1995). In addition, the 3CLP 
domain is cleaved from the polyprotein by the autocatalytic cleavage 
activity of 3CLP itself to generate a 27-34 kDa protein, which contains 
both the trans- and cis-acting proteolytic activities (Lu et al., 1995, 
1996; Liu and Brown, 1994; Ziebuhr et al., 1995). E64d also inhibited 
the 3CLP protease activity. 

The processing pathway of the 1b protein sequence is less clear. 
There is experimental evidence with IBV and HCV-229E that the 
1b polyprotein is cleaved in trans by the 3CLP encoded by ORF 1a
(Liu et al., 1994; Ziebuhr et al., 1995; Grötzinger et al., 1996). A polypeptide
of approximately 100 kDa, representing the extreme C terminus
of ORF 1a and the N terminus of the frameshifted ORF 1b, was
immunoprecipitated from IBV-infected cells. The cleavage sites of the 
100-kDa protein appear to be at the Q/S sites, as predicted from the 
computer analysis and consistent with the known substrate specificity 
of the picornavirus 3C protease. A similar observation was recently 
made with HCV-229E (Grötzinger et al., 1996). This 100-kDa protein
contains the putative RNA polymerase motif and thus may represent 
the coronavirus polymerase. The coding region for this protein belongs 
to complementation group D, which has been shown to affect mRNA
transcription (Fig. 6) (Schaad et al., 1990).

b. Processing of the Structural Proteins 


1. S protein. The S protein is co-translationally glycosylated with N- 
linked glycans. Conversion of the high mannose (simple) glycans of 
the S protein to complex ones is a slow process, the half-life being one
to several hours (Vennema et al., 1990a). The S protein forms
multiple disulfide bonds as it folds into a more complex structure
(Opstelten et al., 1993) and oligomerizes into a trimer in the Golgi
complex (Delmas and Laude, 1990). 

The S prepropolypeptide is converted to a propolypeptide by re- 
moval of the N-terminal signal peptide. Whether the propolypeptide 
is cleaved to generate S1 and S2 depends on the virus species and 
strain and, to some extent, on the cell type in which the virus is grown 
(Frana et al., 1985). The S1-S2 cleavage site in IBV and MHV is 
adjacent to several basic residues (Cavanagh et al., 1986a; Luytjes 
et al., 1987). Those coronaviruses whose S protein is not cleaved, 
e.g., FCV, TGEV, and CCV, have no such pairs of basic residues. 
Cleavage of the MHV S protein occurred after conversion of the 
glycans from simple to complex forms (Vennema et al., 1990a). After
cleavage, the S1 and S2 subunits are held together by noncovalent
linkages (Cavanagh et al., 1986b; Sturman et al., 1990). The S2
protein of MHV is acylated, possibly involving some of the many
cysteine residues in the C-terminal, hydrophilic tail of S (Schmidt,
1982; Sturman et al., 1985; van Berlo et al., 1987). The processing
of S proteins is reviewed in greater detail by Cavanagh
et al. (1995).

2. M protein. Modification of the M protein depends greatly on the
virus species. The major modification is glycosylation. The oligosaccharides
of IBV and the TGEV group are co-translationally
added N-linked glycans (Stern and Sefton, 1982b). The conversion
of the high mannose to complex glycans is not very efficient. In
contrast, viruses of the MHV group have O-linked glycans which
are added posttranslationally (Holmes et al., 1981; Niemann et al.,
1982, 1984; Tooze et al., 1988; Locker et al., 1992a; Krijnse-Locker
et al., 1994). The M protein of TGEV is also sulfated (Garwes et al., 
1976), but whether this is linked directly to the polypeptide or to 
glycans is unknown. Unlike the M proteins of IBV and the MHV 
group, which have an internal membrane insertion sequence, those 
of the TGEV group have an N-terminal membrane insertion se- 
quence that is absent from the mature M protein (Laude et al., 
1987). This signal sequence, however, is not an essential requirement
for the membrane insertion of the M protein (Kapke et al.,
1988; Vennema et al., 1991).

3. HE protein. The HE glycoprotein has N-linked glycans which are
converted to complex ones in the Golgi complex. The N-terminal 
signal sequence is cleaved from the mature protein, which then 
forms dimers by disulfide bonds (King et al., 1985; Hogue et al.,
1989; Kienzle et al., 1990; Yoo et al., 1992). 

4. E protein. The only known modification for the E protein is acylation
of MHV E protein (Yu et al., 1994). However, this was not observed
for the E protein of TGEV when expressed in insect cells (Godet et 
al., 1992). 

5. N protein. The N protein is phosphorylated, the phosphate linkage 
being exclusively to serine residues (Stohlman and Lai, 1979). The 
role of phosphorylation is unknown. 


H. Virus Assembly and Release 


In virus-infected cells, the assembly of virus particles presumably 
starts with the formation of RNP, which interacts with the components 
of viral envelope proteins to form enveloped virus particles and bud 
into the endoplasmic reticulum (ER) and Golgi complex. Several recent 
advances shed light on this process: 


1. Early studies have shown that the S proteins are not necessary 
for virus particle formation; thus, denuded virus particles without 
spikes can be formed in the virus-infected cells treated with tunica- 
mycin, which inhibits N-glycosylation and transport of the S and 
HE proteins (Holmes et al., 1981). Further, recent studies have 
shown that the minimum requirement for the formation of virus- 
like particles (VLP), i.e., empty virus particles, is the M and E 
proteins (Bos et al., 1996; Vennema et al., 1996); 

2. The sites of virus budding are in the ER and Golgi, near the sites 
of accumulation of the M protein (Dubois-Dalcq et al., 1982; Tooze
et al., 1984; Tooze and Tooze, 1985; Klumperman et al., 1994); thus,
the interaction between the M and E proteins appears to be the key 
event for virus particle assembly. The incorporation of the nucleocap- 
sids and S and HE proteins into virus particles may involve subse- 
quent interactions of these components with the M-E complex. 


The virus assembly and release process has been studied in most 
detail for MHV (J. Tooze et al., 1984, 1987; Tooze and Tooze, 1985; 
S. A. Tooze et al., 1988; Krijnse-Locker et al., 1994), and the gross 
features have recently been confirmed for IBV, TGEV, and FIPV 
(Klumperman et al., 1994). Recently, an ultrastructural study of the 
replication of IBV in renal ductotubular epithelial cells of infected 
chicks has also been very informative (Chen and Itakura, 1996). The 
first virions form in the perinuclear region, in small, smooth vesicles/tubules
between the rough ER and the cis side of the Golgi stack. Later,
the rough ER becomes the major site of virion assembly, extending 
beyond the perinuclear region. Virions then proceed through the Golgi 
complex, at the trans side of which they are collected into vesicles 
of the constitutive exocytic pathway and subsequently released from 
the cell. 

The major determining factor for the site of virus assembly appears 
to be the site of localization of the M protein, which is in the Golgi 
complex. There are some points of difference among the coronaviruses. 
When the M protein of MHV was expressed, it accumulated in the 
trans-Golgi membranes, consistent with its O-linked glycosylation, 
which occurs efficiently (Locker et al., 1992a; Klumperman et al., 1994). 
In contrast, expression of the IBV M protein from cDNA resulted in 
its accumulation in cis-Golgi membranes; consequently the high-mannose
N-linked glycans of the M protein were not efficiently converted
to complex ones (Machamer et al., 1990; Klumperman et al.,
1994), in agreement with the properties of the M protein in the IBV 
virions. Glycosylation of the coronavirus M protein is not essential for 
its translocation or for virus particle formation. The M protein exists 
as monomers in the ER, but it oligomerizes to form variously sized 
complexes during transport through the Golgi and trans-Golgi network 
(Locker et al., 1995). It is likely that the M molecules in the virus 
particles are in complexed form. 

The sequence requirements for insertion of the nascent M polypep- 
tide into the rough ER have not been precisely defined. With the excep- 
tion of the TGEV group, the coronavirus M proteins do not have an 
amino-terminal signal peptide. Even in the case of the TGEV group, 
the signal peptide is not essential for membrane insertion of the M 
protein (Kapke et al., 1988; Vennema et al., 1991). Rather, one of the 
three transmembrane sequences of the M protein is responsible for the 
insertion of M into the ER and its final localization in the Golgi complex 
(Machamer and Rose, 1987; Mayer et al., 1988; Armstrong et al., 1990; 
Locker et al., 1992b). Different domains of the M protein of IBV and 
MHV have been identified as the sequences responsible for the final 
localization of the protein. The first membrane-spanning domain of the 
IBV M protein performs this function, the M protein being concentrated 
in the cis-Golgi membranes (Machamer and Rose, 1987; Machamer et 
al., 1990, 1993; Swift and Machamer, 1991). In contrast, the carboxy- 
terminal domain of the MHV M protein, probably in combination with 
a middle domain, directs the protein to the trans-Golgi (Armstrong and 
Patel, 1991; Weisz et al., 1993; Krijnse-Locker et al., 1994). It should be
borne in mind, however, that the major site of virus particle formation is
proximal to either of the Golgi compartments, namely, in an intermediate
compartment between the ER and the Golgi complex (Klumperman
et al., 1994). Thus, it is proximal to the major site of M accumulation. 
What is responsible for that? The answer would appear to be that 
the S glycoprotein and the nucleocapsid interact with the M protein 
molecules before the M proteins have migrated to the Golgi, precipitat- 
ing virus particle formation. 

It has been shown that the coronavirus M protein can interact with 
nucleocapsids (Sturman et al., 1980). This interaction requires the pres- 
ence of viral RNA, since the N protein alone cannot be incorporated 
into the VLPs (Bos et al., 1996; Vennema et al., 1996), suggesting either
that M interacts with viral RNA, or that RNA-N protein binding in- 
duces a conformational change in the N protein, enabling it to interact 
with M. Interaction between the M and S proteins has also been demon- 
strated. The M and S proteins co-sediment under certain ionic condi- 
tions after dissolution of virions with mild detergents (Cavanagh, 
1983b), and cell-associated complexes containing M and S have been 
detected (Opstelten et al., 1995). The S protein undergoes certain conformational
changes induced by disulfide linkage before it is able to interact
with M (Opstelten et al., 1993, 1995). Inhibition of correct oligomerization
of S by dithiothreitol prevented interaction of S with M and, as a
result, the rate of transport of the M protein to the trans-Golgi increased 
(Opstelten et al., 1993). This result suggests that S-M interaction can 
retard the transport of the M protein. The ability of the S or HE protein 
to interact with the M protein appears to be a prerequisite for their 
incorporation into virus particles. In this regard, it is interesting to 
note that MHV ts mutants with a deletion in the ectodomain of the S 
protein or those with defects in oligomerization of the S protein do not 
incorporate the S protein (Ricard et al., 1995; Luytjes et al., 1997). 
Also, partial deletions in the ectodomain of the HE protein prevent 
its incorporation into virus particles (Liao et al., 1995). These results 
suggest that the interaction of S or HE with M occurs through the 
ectodomain or requires the correct protein conformation in the ectodomain.
The S-M complex forms in the pre-Golgi complex but then
progresses to the Golgi complex,
indicating that this interaction is not sufficient to localize it in the pre-Golgi
complex, the ultimate site of virion budding (Opstelten et al.,
1995). Thus, M-nucleocapsid interaction may also contribute to the 
determination of the site of virus assembly. In this regard, the recent
discovery that M is present in the viral RNP core, as well as in the
envelope (Risco et al., 1996), may further indicate the crucial role of
the M protein in the virus assembly process.
Only the M and E proteins are required for the production of VLPs
(Bos et al., 1996; Vennema et al., 1996). These particles were formed
when the M and E proteins were expressed from transfected plasmids. 
S protein was incorporated into the VLPs if expressed. In the absence 
of viral RNA, the N protein also was not incorporated. When all the 
structural proteins were expressed from plasmids in the presence of 
an MHV DI RNA, which contains a packaging signal, and in the absence
of helper virus, the VLPs contained the DI RNA (Bos et al., 1996).
Moreover, these VLPs were “infectious,” i.e., on transfer of the released
VLPs to a new cell culture, they were able to infect the cells, as revealed 
by the rescue of the DI RNA by helper virus. These results show that 
N is dispensable for the formation of VLPs but the packaging of RNA 
into virions requires an interaction between M and the N-containing
ribonucleoprotein, as previously demonstrated (Sturman et al., 1980). 
The expression of the M protein alone in the cells did not lead to VLP 
formation or induction of curvature in the M-containing intracellular 
membranes. The presence of the E protein together with the M protein 
triggered both events, but the ratio of M:E in virions was as high as 
100:1 (Vennema et al., 1996). This has led to the suggestion that E
does not have frequent, regular positions in the lattice formed by M but 
rather occupies strategic positions within the lattice to cause membrane 
curvature. Alternatively, its role may be to close the neck of the virus 
particle as it pinches off from the membrane in the final stage of 
budding. 

What determines the site of virion budding? It is possible that the 
E protein dictates the site of budding, since this protein is also localized 
in the perinuclear region and associated with membrane (Godet et al., 
1992; Yu et al., 1994). Alternatively, it may be the interaction of the 
RNP-nucleocapsid with the S-M complexes which halts the migration 
of the latter and promotes budding. Relevant to this notion is the 
observation that the nucleocapsids and free N protein have affinity for 
membranes (Anderson and Wong, 1993). It should be remembered, 
however, that in the absence of S, HE, and nucleocapsids, the E and 
M proteins alone can induce budding to form VLPs (Bos et al., 1996;
Vennema et al., 1996). It is not yet clear whether the budding site of 
VLPs containing only M and E is the same as that for the complete
virion. Empty virus particles have previously been isolated from IBV
grown in embryonated fowl eggs (Macnaughton and Davies,
1980). This supports the view that even during natural infection, virus 
budding can be induced without involvement of the viral nucleocapsid. 
Parallels have been drawn between the E protein of coronaviruses, the 
M2 protein of orthomyxoviruses, and the 6K protein of alphaviruses.
All are minor envelope proteins that play a role in virus assembly. 

Once the virus particles bud into the pre-Golgi compartment, they 
are transported through the Golgi complex. Whether the Golgi- 
associated posttranslational modifications occur before or after incorpo- 
ration of the proteins into virus particles is not known. Retrograde 
transport of the proteins may be required for some steps of the virus 
assembly process. Finally, the release of virus particles from the cells 
appears to be restricted to certain areas of cells. TGEV grown in polarized
LLC-PK1 cells both enters and exits by the apical surface (Rossen
et al., 1994), whereas MHV-A59 enters polarized murine kidney cells 
(mTAL) by the apical surface but is released via the basolateral surface 
(Rossen et al., 1995a). However, the site of virus release varies with 
different cell lines (Rossen et al., 1997). The factors governing this 
process are not known (Rossen et al., 1995b). 


VI. GENETICS OF CORONAVIRUSES 


Probably because of the large size of their RNA genomes, coronavi- 
ruses have developed a variety of genetic mechanisms, among which 
are RNA recombination and generation of DI RNA, to maintain their 
genetic stability and, as a side product, generate diversity. Coronavi- 
ruses also readily undergo genetic mutation, a characteristic common 
to all RNA viruses. Thus, they evolve rapidly and are heterogeneous. 
These genetic phenomena provide virologists with useful tools for un- 
derstanding coronavirus biology, particularly because reverse genetics 
studies for coronaviruses are not yet feasible. 


A. Natural Virus Variants and Mutants 
1. Temperature-Sensitive Mutants 


Using a variety of chemical mutagens, several laboratories have 
isolated MHV temperature-sensitive (ts) mutants which either cannot
produce infectious virus particles or produce a different plaque morphology at
the nonpermissive temperature (Haspel et al., 1978; Robb et al., 1979; 
Wege et al., 1981; Koolen et al., 1983; Schaad et al., 1990). Some of 
these mutants have been characterized with respect to their ability to 
synthesize RNA and have been grouped into at least seven complementation
groups (Leibowitz et al., 1982a), five of which have the RNA (-)
phenotype (i.e., cannot synthesize RNA at the nonpermissive temperature)
(Leibowitz et al., 1982a; Schaad et al., 1990) (see Fig. 6). With the
use of recombination analysis (see below), the possible genetic defects of
the mutants were mapped on the RNA genome (Baric et al., 1990; Fu
and Baric, 1994). It appears that all of the RNA (-) mutants have
genetic defects within gene 1, suggesting that gene 1 encodes RNA 
polymerase and other proteins involved directly or indirectly in viral 
RNA synthesis. The genetic defects of some of these mutants have 
been confirmed by RNA sequence analysis of the mutants and their 
revertants (Fu and Baric, 1994). These five different complementation 
groups have been demonstrated to affect different steps of RNA synthe- 
sis, including the synthesis of leader RNA, negative-strand RNA, and 
positive-strand RNA (Fig. 6), suggesting that different steps of RNA 
synthesis require different viral proteins (Baric et al., 1990; Schaad et 
al., 1990). It is still not possible, however, to correlate the genetic 
defects definitively with the known processed products of the gene 
1 polyprotein. 

Among the RNA (+) mutants, two complementation groups have 
been assigned to the gene encoding the S protein (Baric et al., 1990; 
Fu and Baric, 1994), but the phenotype of these mutants has not been 
well characterized. Another RNA (+) mutant, Alb 18, has a single 
amino acid substitution in the N-terminal domain of S protein and 
cannot incorporate S protein into the virus particles (Ricard et al., 
1995). Still another group of RNA (+) mutants have a defective N 
protein (Koetzner et al., 1992; Masters et al., 1994; Peng et al., 1995a) 
and produce smaller plaques at the nonpermissive temperature; several 
of these mutants have a deletion in the N gene (Masters, 1992) and 
are defective in RNA-binding activity (Peng et al., 1995a). Most wild- 
type revertants have a second-site mutation in the N protein and re- 
stored RNA-binding activity (Peng et al., 1995a). 


2. Neutralization-Escape Mutants 


Another class of viral mutants was obtained by a specific selection 
scheme, e.g., by treating viruses with neutralizing MAb and selecting 
mutant viruses resistant to neutralization. Since neutralizing antibod- 
ies are usually directed against the S protein, all of the neutralization- 
escape mutants were presumed to have defects in the S gene. This was 
indeed the case (reviewed by Cavanagh et al., 1995). Depending on the 
neutralizing MAb used for selection, the mutants obtained had either 
deletions or point mutations in the neutralization epitopes of the S 
protein (Gallagher et al., 1990; Wang et al., 1992). These mutants
generally retain growth properties very similar to those of the parental 
virus but often have significantly different pathogenic properties with 
altered tissue tropism (Dalziel et al., 1986; Fleming et al., 1986, 1987;
Wege et al., 1988). 


3. Other Nonconditional Deletions or Substitutions 


During serial virus passages in tissue culture or in animals, coronavi- 
ruses often undergo various deletions or substitutions even in the ab- 
sence of experimentally applied selection pressure. These genetic 
changes probably provide the emerging virus variants with evolution- 
ary advantages under experimental conditions or in natural infection. 
The deletions occur most frequently within the S gene, particularly 
within a hypervariable region encoding the S1 subunit (S. E. Parker 
et al., 1989; Wang et al., 1992). In fact, some natural isolates of MHV 
have a deletion of 150-460 nucleotides in this region (Fig. 4). Similar 
deletions have been detected in virus variants during central nervous 
system (CNS) infections of rats (La Monica et al., 1991). In persistent 
infections of cultured cells of CNS origin, viruses with point mutations 
or deletions in the gene encoding S protein are frequently selected 
(Gallagher et al., 1991; Gombold et al., 1993; Rowe et al., 1997). These 
viruses often have altered cell fusion and pathogenic properties. 

The most striking effect of deletions during natural virus infection 
is illustrated by the emergence of PRCV from TGEV. TGEV causes 
epizootic enteric infection in pigs, resulting in a very high mortality 
rate in newborn pigs. An attenuated virus strain that is related to 
TGEV but infects only respiratory tissues was isolated in Western 
Europe in the early 1980s (Pensaert et al., 1986). An independent isolate 
of PRCV was subsequently obtained in the United States (Wesley et 
al., 1990). Both of these PRCV isolates have similar extents of deletion 
in the N terminus of the S1 protein, in addition to smaller deletions
in gene 3 that eliminate its expression (Rasschaert et al., 1990;
Wesley et al., 1991; Laude, 1993). Although it is not yet possible to link 
the changes in viral pathogenicity to the deletions in the S gene or 
gene 3, the TGEV-PRCV evolution illustrates the power of deletions 
in coronavirus evolution. 


B. Complementation 


Different ts mutants with defects in different coronaviral genes have 
been demonstrated to complement each other. The available ts mutants 
of MHV have been divided into at least seven complementation groups, 
five of which have an RNA (-) phenotype (Leibowitz et al., 1982a) (Fig.
6). It is worth noting that these five RNA (-) complementation groups
have been mapped in gene 1 (Baric et al., 1990), which is translated
into a polyprotein. The existence of five complementation groups within 
this gene indicates that this polyprotein is processed into at least five 
different proteins that function independently. It is not possible, how- 
ever, to complement the genetic defects of a virus by expressing a wild- 
type viral protein from an exogenous vector. 


C. Phenotypic Mixing and Pseudotype Virus Formation 


Mixed infection with MHV and murine leukemia virus in tissue 
culture cells yielded a pseudotype MHV which contained a murine 
leukemia virus envelope protein and was neutralized by antibodies 
against both murine leukemia virus and MHV (Yoshikura and Taguchi, 
1978). This phenotypic mixing of viral proteins suggests the lack of a 
stringent requirement for a virus-specific spike protein for the forma- 
tion of coronavirus particles. Pseudotype formation of virus particles 
has also been achieved by expressing a viral protein, e.g., HE protein, 
from a DI RNA vector (see Section VI, E), which was incorporated into 
virus particles (Liao et al., 1995). 


D. RNA Recombination 


One unique genetic feature of coronaviruses is their ability to un- 
dergo RNA recombination at a very high frequency; this is particularly 
true of MHV, in which recombinant viruses containing parts of the 
genomic sequences of both parental viruses could be isolated at high 
frequency when cultured cells or animals were co-infected with two strains
of MHV carrying defined genetic markers. This genetic phenomenon was
first discovered using two ts mutants of MHV (Lai et al., 1985).
Subsequently, many different recombinant MHVs were isolated (Keck et al.,
1987, 1988b,c; Makino et al., 1987) using a combination of selection 
markers, such as ts markers, resistance to neutralizing antibodies, and 
cytopathic effects (the ability of the virus to cause fusion). Based on 
the distribution of the crossover sites on the viral RNA genome, it 
appears that recombination can occur practically anywhere on the viral 
genome, although some combinations of virus strains favor selection 
of viruses with certain recombination sites (Lai, 1992). For example, 
between the MHV A59 and JHM strains, recombination occurs mostly 
at the 5’ end of the genome and rarely at the 3’ end. In contrast, 
recombination between the MHV-2 and JHM strains occurs readily at 
the 3’ end (Keck et al., 1988c). The most surprising finding with regard 
to MHV recombination is the extremely high frequency of recombination,
which has been calculated to be nearly 25% for the entire MHV
genome (Baric et al., 1990). This high frequency of recombination is
reminiscent of the reassortment of segmented RNA genomes in viruses 
such as influenza virus and reovirus. The recombination map for MHV 
is nearly linear, suggesting the random occurrence of recombination 
(Baric et al., 1990); however, more careful analysis of the recombination 
frequency showed that there is an increasing gradient of recombination 
frequency (in the 5’ to 3’ direction) across the genome, suggesting
that subgenomic mRNAs, which represent preferentially the 3’ end 
sequences, may participate in RNA recombination (Fu and Baric, 1992, 
1994). Recombination has now been demonstrated experimentally for 
IBV (Kottier et al., 1995) and TGEV (Ballesteros et al., 1997) in
embryonated eggs or tissue culture; however, the recombination frequency for
these viruses has not been determined. 

Recombination can provide a powerful tool for virus evolution. For 
example, in a study in which ts mutants of the A59 strain of MHV 
were co-infected with a wild-type JHM strain, the majority of the prog- 
eny viruses after a single passage were recombinants which contained 
the 5’ end of the A59 genome (Makino et al., 1986a), suggesting that 
this recombinant virus has evolutionary advantages. Recombination 
has also been demonstrated during virus infection in animals (Keck et 
al., 1988b). 

Similar to the situation in other RNA viruses, coronavirus recombi- 
nation probably occurs by a copy-choice mechanism (Lai, 1992). It has 
been shown that MHV RNA synthesis normally pauses at certain sites 
on the RNA genome (Baric et al., 1987). The nascent, incomplete RNA
transcripts may dissociate from the template RNA and then rebind to 
the template to resume RNA synthesis. When the nascent RNA binds 
to a different template, the resumed RNA synthesis will result in a 
recombinant RNA. Whether coronavirus recombination occurs more 
frequently at certain RNA sites with more complex secondary structure 
is not yet known. When RNA recombination was examined under nonse- 
lective conditions (by reverse transcription-polymerase chain reaction 
detection of the intracellular RNA from virus-infected cells), recombina- 
tion sites appeared to be random; only after serial passages did “hot 
spots” of RNA recombination become apparent (Banner and Lai, 1991). 
This finding indicates that the recombination hot spots may be the 
result of selection. 
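
The copy-choice model described above can be illustrated with a toy simulation. The following is only a minimal sketch, not a model of the MHV polymerase: the function name `copy_choice`, the per-position switching probability `switch_prob`, and the two marker "templates" are hypothetical, and real pausing sites, RNA secondary structure, and selection are all ignored.

```python
import random


def copy_choice(templates, switch_prob, rng):
    """Copy one nascent strand, switching template with probability switch_prob.

    templates   -- equal-length strings standing in for co-infecting genomes
    switch_prob -- assumed chance per position that the nascent strand
                   dissociates and rebinds to a different template
    Returns the product strand and the list of crossover positions.
    """
    length = len(templates[0])
    if len(templates) < 2 or any(len(t) != length for t in templates):
        raise ValueError("need at least two templates of equal length")

    current = 0          # index of the template currently being copied
    product = []
    crossovers = []
    for pos in range(length):
        if pos > 0 and rng.random() < switch_prob:
            # Nascent RNA rebinds to a different template: a crossover.
            others = [i for i in range(len(templates)) if i != current]
            current = rng.choice(others)
            crossovers.append(pos)
        product.append(templates[current][pos])
    return "".join(product), crossovers


if __name__ == "__main__":
    rng = random.Random(0)
    parent_a = "A" * 60   # hypothetical marker sequence for one parent
    parent_b = "B" * 60   # hypothetical marker sequence for the other
    strand, sites = copy_choice([parent_a, parent_b], 0.05, rng)
    print(strand)
    print("crossover positions:", sites)
```

Because switching is equally likely at every position in this sketch, crossover sites are scattered at random, which corresponds to the nonselective situation described above; reproducing "hot spots" would require an additional selection step that is not modeled here.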

Recombination has been detected during natural infections of corona- 
viruses, most notably IBV. Sequence analysis of natural IBV strains has 
provided convincing evidence that some IBV strains are recombinants
between different IBV strains; recombination sites have been detected
so far in the 5’ half of the S gene and at the 3’ end of viral RNA (Kusters 
et al., 1989; Cavanagh and Davis, 1992; Wang et al., 1993, 1994; Jia 
et al., 1995). Thus, recombination is a natural evolutionary strategy 
for coronaviruses. 

RNA recombination may also explain the difference in genome struc- 
ture among different coronaviruses. For example, IBV contains an addi- 
tional gene, gene 5 (a nonstructural protein gene) inserted between 
gene M and gene N (Fig. 5). This insertion could be the result of a 
recombination mechanism involving the consensus IG sequence, which 
provides a favored recombination site. Since all of the coronavirus genes 
are flanked by consensus IG sequences, each gene can be considered a 
gene “cassette,” which can be rearranged by homologous recombination 
involving the consensus IG sequence. A nonhomologous recombination 
event between coronavirus RNAs and other virus or cellular RNAs may 
also explain the gene insertions in some coronaviruses. For example, 
MHV and BCV contain an additional gene, HE, which is similar in 
sequence to the HE gene of influenza C virus (Luytjes et al., 1988). 
This gene may have been derived by recombination between a coronavi- 
rus and influenza C virus. Comparison between genome structures of 
coronavirus and torovirus also suggests that several recombination 
events may have been involved in rearranging the order of several 
genes during the evolution of these viruses (Snijder et al., 1991). 

Recombination has been demonstrated to occur between viral 
RNA and a transfected RNA fragment derived from the viral genome 
(Koetzner et al., 1992; Liao and Lai, 1992). Since transfection of both 
the positive- and negative-strand RNA fragments led to recombination, 
these results suggested that recombination can occur during both 
positive- and negative-strand RNA synthesis (Liao and Lai, 1992). Re- 
combination can also take place between DI RNAs and viral RNA 
reciprocally, i.e., the viral RNA sequence can be incorporated into DI 
RNA, and vice versa, during viral RNA replication. The incorporation 
of a helper viral RNA sequence into DI RNA accounts at least partially 
for the continuous evolution of MHV DI RNA species during serial 
passages in cultured cells (Furuya et al., 1993) (see the next section). 
This phenomenon also explains why some genetic markers in the DI 
RNA were rapidly replaced by the helper viral RNA sequences during 
DI RNA replication (de Groot et al., 1992; Kim et al., 1993a). On the 
other hand, the incorporation of DI RNA sequences into viral RNA by 
recombination provides an important tool to introduce desired se- 
quences into the viral genome. For example, when an mRNA 7 or DI 
RNA containing the N gene of MHV was transfected into cells infected 
with an MHV ts mutant containing a defective N protein, recombination
occurred between the DI RNA and the wild-type viral RNA, resulting 
in recombinant viruses which had a wild-type RNA sequence derived 
from the transfected RNA in place of the defective N gene (Koetzner 
et al., 1992; van der Most et al., 1992; Masters et al., 1994; Peng et al., 
1995a). An MHV recombinant containing a chimeric N protein of BCV 
and MHV has also been derived by this RNA recombination strategy 
(Peng et al., 1995b). This targeted RNA recombination promises to be 
a powerful tool. 

Recombination is thus one of the most distinctive aspects of coronavirus
biology. It can potentially provide a genetic mechanism by which coro- 
naviruses maintain their sequence integrity. In view of the large size 
of the coronavirus RNA, it is predictable that most of the viral RNA 
molecules would contain mutations due to the high error frequencies 
of RNA polymerases; recombination may provide a repair mechanism 
for the virus (Lai, 1992). 
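
The size argument can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that errors occur independently at a uniform per-nucleotide rate; the rate of 1e-4 and the function name `fraction_with_mutation` are assumptions and are not taken from the studies cited here. The genome lengths are the approximate sizes quoted earlier in this review.

```python
def fraction_with_mutation(genome_length, error_rate):
    """Probability that one copied genome carries at least one error,
    assuming independent errors at a uniform per-nucleotide rate."""
    return 1.0 - (1.0 - error_rate) ** genome_length


if __name__ == "__main__":
    ASSUMED_ERROR_RATE = 1e-4   # illustrative assumption, not from the review
    for name, length in [("picornavirus (~8 kb)", 8_000),
                         ("coronavirus (~30 kb)", 30_000)]:
        p = fraction_with_mutation(length, ASSUMED_ERROR_RATE)
        print(f"{name}: {p:.2f}")
```

Under that assumed rate, roughly 95% of 30-kb copies would carry at least one change, compared with roughly 55% of 8-kb copies, which is the sense in which a repair mechanism such as recombination becomes more valuable as genome size increases.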


E. Defective-Interfering (DI) RNAs 


Similar to most RNA viruses, coronaviruses can readily generate 
DI particles when viruses are passaged in tissue culture at a high 
multiplicity of infection. This has been demonstrated for MHV, IBV, 
and TGEV. When MHV was serially passaged, different types of DI 
RNA appeared at different passage levels, suggesting that DI RNAs 
continue to evolve and that new DIs have a selective advantage under 
the evolving cellular conditions (Makino et al., 1985). However, the 
IBV and TGEV DIs appear to be more stable (Penzes et al., 1994;
Mendez et al., 1996). The generation of DI RNAs is probably caused 
by polymerase jumping during RNA replication or nonhomologous RNA 
recombination. Although no sequence homology exists at the fusion 
sites of different RNA regions within the DI RNA, a high degree of 
potential secondary structure does exist at some of its RNA fusion sites 
(Makino et al., 1988b), which may have facilitated the pausing and 
template switching of RNA polymerase during synthesis. If nonhomolo- 
gous recombination is involved in generating DI RNA, it probably occurs 
between two different RNA molecules because DI RNAs are generated 
only at high multiplicity of infection. Recombination between an exist- 
ing DI RNA and helper virus RNA has been shown to contribute to the 
evolution of MHV DI RNAs during virus passages (Furuya et al., 1993). 

The coronavirus DI RNAs can be grouped into three types. The first 
type is of nearly genomic size and is typified by DIssA RNA of MHV 
(Makino et al., 1985). This DI RNA is efficiently packaged into virus
particles and contains several deletions in the viral genome, but it 
contains a functional gene 1, which encodes RNA polymerase, and a 
functional gene 7, which encodes N protein. These two functional gene 
products are sufficient to support DI RNA replication (K. H. Kim and 
Makino, 1995); thus, this type of DI RNA can replicate without a helper 
virus (Makino et al., 1988a; K. H. Kim and Makino, 1995). By definition, 
it is not a DI RNA, inasmuch as it is not defective in replication; 
however, because it is smaller than the genomic RNA and is produced 
at a high multiplicity of infection, it is classified as a DI RNA. This 
type of DI is unique to coronavirus. A 22-kb DI RNA has been described 
for TGEV (Mendez et al., 1996), but whether it can replicate in the 
absence of a helper virus has not been examined. 

The second type of DI RNA is typified by DIssE of MHV (Makino et
al., 1988b). This DI RNA is truly defective and can replicate only in 
the presence of helper viruses. It replicates very efficiently, but is poorly 
packaged into virus particles because it lacks a specific RNA-packaging 
signal. This type of DI RNA typically contains both the 5’ and 3’ ends 
of the wild-type viral RNA and one or several discontiguous regions of 
the wild-type RNAs. Because of the high efficiency of replication, this 
type of DI can still be serially passaged in tissue culture for at least 
several passages, probably because a small amount of DI RNA can be 
nonspecifically packaged into the virion. 

The third type of DI RNA is represented by DIssF of MHV-JHM 
(Makino et al., 1990) and DI-a of MHV A59 (van der Most et al., 1991). 
It is similar to the second type but contains an RNA-packaging signal 
and is thus packaged efficiently into virus particles. This type of DI 
RNA has been detected in IBV (Penzes et al., 1994) and TGEV (Mendez
et al., 1996). A small DI RNA (2.2 kb) of BCV may also belong to 
this type (Chang and Brian, 1996), but whether this DI RNA can be 
specifically packaged into virions is not certain.
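
The three-way grouping above rests on two properties: whether the DI RNA can replicate without a helper virus, and whether it carries a packaging signal. As a reading aid only, this can be written as a small decision rule. The class `DIRna`, its field names, and the function `classify_di_rna` are hypothetical labels introduced for this sketch; the example entries simply restate what the text says about DIssA, DIssE, and DIssF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DIRna:
    name: str
    replicates_without_helper: bool   # encodes functional gene 1 and N
    has_packaging_signal: bool        # packaged efficiently into virions


def classify_di_rna(di):
    """Return 1, 2, or 3 following the grouping described in the text."""
    if di.replicates_without_helper:
        return 1    # nearly genomic size, self-replicating (e.g., DIssA)
    if di.has_packaging_signal:
        return 3    # helper-dependent, efficiently packaged (e.g., DIssF)
    return 2        # helper-dependent, poorly packaged (e.g., DIssE)


if __name__ == "__main__":
    examples = [
        DIRna("DIssA", replicates_without_helper=True, has_packaging_signal=True),
        DIRna("DIssE", replicates_without_helper=False, has_packaging_signal=False),
        DIRna("DIssF", replicates_without_helper=False, has_packaging_signal=True),
    ]
    for di in examples:
        print(di.name, "-> type", classify_di_rna(di))
```

DI RNAs for which one of the two properties has not been established, such as the 22-kb TGEV DI RNA or the 2.2-kb BCV DI RNA mentioned above, cannot be assigned by this rule without further data.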

All three types of DI RNAs contain an ORF, which encodes a protein 
fused from two different viral proteins. This ORF is not required for 
the replication of MHV DI RNA (Liao and Lai, 1995); nevertheless, 
MHV DI RNAs with a functional ORF usually have an evolutionary 
advantage over those without one or with a smaller ORF (de Groot et 
al., 1992; Kim et al., 1993a). Therefore, a DI RNA containing a short
ORF was often rapidly replaced by DI RNAs containing a longer ORF 
that had been generated by recombination or mutation (de Groot et al., 
1992; Kim et al., 1993a). The translatability of the ORF may be more 
important than the nature of the actual protein translated from this 
ORF (van der Most et al., 1995), suggesting that translation of RNA 
may facilitate RNA replication. Reduction of the ORF of an IBV DI
RNA to just 20 amino acids did not diminish its capacity to be replicated 
or packaged (Penzes et al., 1996). However, it has been shown for a 
BCV DI RNA that a BCV-specific N protein translated from the DI 
ORF (a cis-acting protein) is required for efficient DI RNA replication 
(Chang and Brian, 1996). The variation in the sequence requirement 
for RNA replication of these DI RNAs may be related to their overall 
RNA conformation. The significance of DI RNA in the biology and 
natural evolution of coronaviruses is not known. 

DI RNAs provide useful tools for studying the sequence and struc- 
tural requirements for various functions of viral genomic RNA. As they 
contain cis-acting signals for RNA replication, they are mini-versions 
of the viral genomic RNA. However, it should be cautioned again that 
because of the small size of the DI RNA compared to the genomic RNA, 
the structural requirements for various RNA functions, as determined 
from the use of DI RNA constructs, may be different from those of the 
whole viral genome. 

The following cis-acting signals for various RNA functions have been 
determined using various DI RNAs: 


1. RNA-packaging signal. In a comparison of MHV DI RNAs that are 
efficiently and inefficiently packaged, it was determined that the 
packaging signal for MHV DI RNA is localized near the 3’ end of 
gene 1 (in the 1b region, approximately 20 kb from the 5’ end) 
(Makino et al., 1990; van der Most et al., 1991; Fosmire et al., 1992).
This packaging signal forms a stem-loop structure which may be 
required for the RNA-packaging activity (Fosmire et al., 1992). It 
is necessary and sufficient for the packaging of DI RNA or a heterologous
RNA into the virions (Woo et al., 1997). The fact that this
packaging signal is localized in gene 1, which is present in genomic 
but not subgenomic RNAs, is consistent with the packaging of geno- 
mic but not subgenomic RNAs in virus particles. The packaging 
signal for DI RNAs of other coronaviruses has not been determined. 
However, some coronaviruses have been shown to package subgeno- 
mic mRNAs at low efficiency (Sethna et al., 1989; Hofmann et al., 
1990; Zhao et al., 1993). These are probably packaged nonspecifi- 
cally; however, the possibility that these viruses may have a different 
RNA packaging signal cannot yet be ruled out. Similarly, DI RNAs 
that do not contain this packaging signal, such as DIssE RNA of 
MHV (Makino et al., 1988b) and DI RNA of BCV (Chang and Brian, 
1996; Chang et al., 1996), can be packaged at low efficiency, thereby
maintaining themselves for at least several passages in tissue 
culture. 

2. Negative-strand RNA synthesis. For MHV DI RNA, it has been
shown that only 55 nucleotides at the 3’ end plus a stretch of poly(A) 
sequence are required for negative-strand RNA synthesis (Lin et al., 
1994); no specific upstream RNA sequences are required. However, 
when an mRNA is transcribed from an IG site in the same DI RNA, 
the negative-strand RNA synthesis from this DI RNA is inhibited, 
suggesting a common element involved in mRNA transcription and 
negative-strand RNA synthesis (Lin et al., 1994). One unanswered 
question is whether or not the sequence requirements for the synthe- 
sis of genomic and subgenomic negative-strand RNA are identical. 

3. Replication signal. Sequential deletion analysis has shown that the 
replication (i.e., complete cycles of negative- and positive-strand 
RNA synthesis) of MHV DIssE or DIssF RNAs requires approxi- 
mately 400-800 nucleotides from both the 5’ and 3’ ends. The mini- 
mum sequence requirement for RNA replication may vary with dif- 
ferent DI RNAs. These issues have been discussed in Section V,F. 

4. Transcriptional signal. DI RNAs normally do not transcribe subgen- 
omic mRNAs because they do not have IG sequences. Thus, natural 
DI RNAs can synthesize only the full-sized DI RNA. However, by 
introduction of the consensus IG sequences into DI RNA (Makino 
et al., 1991), it has been possible to use DI RNA as a vector for 
determining the sequence requirement for subgenomic RNA tran- 
scription. The cis- and trans-acting signals for transcription have 
been described in Section V,E. 

5. Recombination. DI RNAs of MHV have been demonstrated to un- 
dergo a high frequency of recombination with helper virus RNA. As 
discussed above, this accounts for the evolution of MHV DI RNA 
species during serial passages of viruses (Furuya et al., 1993). Fur- 
thermore, MHV DI RNAs with a smaller ORF are frequently re- 
placed by a DI RNA with a larger ORF by recombination with the 
helper virus RNA (de Groot et al., 1992; Kim et al., 1993a), suggesting 
that recombination between DI RNAs and helper virus RNAs occurs 
readily. The reciprocal recombination between DI RNA and helper 
virus RNA, i.e., the transfer of DI RNA sequences to the helper virus 
RNA, also has been observed. As a result, the genetic markers on 
the DI RNA can be incorporated into the helper virus RNA (Koetzner 
et al., 1992). Recombination between two DI RNAs, however, has 
not been described. Sequence requirements for RNA recombination 
also have not been studied. BCV DI RNAs also undergo frequent 
recombination (Chang et al., 1996). However, DI RNAs of IBV and 
TGEV appear to be more stable. 


VII. PERSPECTIVES


Coronavirus research has made tremendous progress in the last 
decade. The virus family has grown in size, and many of the features 
thought to be unique to coronaviruses have now been found to be 
shared by some other viruses. Since this serial publication published
the first comprehensive review of the molecular biology of
coronaviruses (Sturman and Holmes, 1983), the literature on this virus
has grown beyond anyone’s ability to review comprehensively
every topic relating to coronaviruses. In this review, we have concentrated
on areas which have shown the most progress and which present
the most challenges. Our choice of literature was meant to be represen- 
tative but is by no means comprehensive. Notably missing from this 
review are the molecular studies related to viral pathogenesis and the 
interactions between the virus and cells. 

Coronavirus research has contributed to the understanding of many 
aspects of molecular biology in general, such as the mechanism of RNA 
synthesis, translational control, and protein transport and processing. 
It remains a treasure capable of generating unexpected insights. De- 
spite two decades of studies on the molecular biology of this virus, there 
are still many problems to be solved: 


1. With regard to the mechanism of RNA transcription, many conflict- 
ing data remain. Coronavirus undoubtedly utilizes a unique, discon- 
tinuous transcription mechanism, but how it acts is a subject of 
debate. An in vitro RNA transcription system, so necessary for an 
understanding of RNA synthesis, is still in its infancy. Related to 
this question is the nature of the RNA polymerase. The sheer size of 
the polymerase gene presents a daunting task. The availability of 
the cDNA clones and expression vectors for this gene has just begun 
to allow this black box to be cracked open. This will undoubtedly be 
a fruitful area of future research. 

2. The last two years have seen the unraveling of the mechanism 
of coronavirus assembly, which, as it turns out, involves a little- 
characterized E protein. How the various viral structural proteins 
interact with each other in the various subcellular compartments 
to form a complete virus particle is an exciting frontier. 

3. More than 30 years after the first coronavirus was seen under the 
electron microscope, an unexpected new feature of the virus, namely, 
an icosahedral core with a helical nucleoprotein, was recently uncov- 
ered. This structure places coronavirus in a unique position among 
RNA viruses because it takes on the characteristics of positive-, 
negative- and double-strand RNA viruses in morphology. This recent 
finding challenges us to reevaluate the structure of coronaviruses. 

4. The ability to perform reverse genetic studies of coronavirus is still 
very limited. Expression of individual viral genes and targeted re- 
combination of very limited RNA regions are the only available 
genetic means for examining the structure and function of the coro- 
navirus genome. Perhaps it is an unrealistic dream, but progress 
in polymerase chain reaction technology may one day allow an infec- 
tious cDNA for coronavirus RNA to be made. 

5. The early events of viral replication have so far been largely ignored. 
Identification of the cellular receptors for the viruses may finally 
provide penetrating molecular tools to allow these issues to be exam- 
ined. It will not be a surprise to discover that virus penetration and 
uncoating play defining roles in the cellular tropism of viruses. 

6. Are nonstructural protein genes really unnecessary? Even if they 
are auxiliary genes, they may prove to play significant roles in the 
biology of the virus. 

7. Finally, what of the potential interaction between the virus and 
host, which has been one of the major themes of virology in recent 
years? It may be a little premature to conclude that cellular factors 
play major roles in coronavirus replication, but there is little doubt 
that cells are playing more active roles than was previously 
suspected. Does the nucleus contribute to coronavirus replication? 
This may require reexamination. 


These are but some of the exciting challenges for coronavirologists 
to tackle. The next decade should bring an even better understanding 
of the various aspects of the molecular biology of coronaviruses. 